How can society as a whole maintain control of AI systems?

Transcript from October 13th Deep Dive: AI Society panel

Stefano Maffulli:

And welcome everyone. Welcome to Deep Dive: AI. This is an event series from the Open Source Initiative. And we started with a podcast not too long ago exploring how artificial intelligence impacts Open Source software, from developers, to businesses, to the rest of us. And today’s panel is part of the second phase of the exploration. We have panelists discussing the challenges and opportunities of AI for society as a whole. And we’re gonna have two more panels on the 18th and the 20th of October next week. But let’s start with the panelists today. I’m Stefano Maffulli, Executive Director of the Open Source Initiative, and today I’m joined by, in no particular order, Luis Villa, developer turned attorney who has worked on open source since the late nineties. He has advised clients ranging from startups to Amazon to Google, worked in house at Mozilla, Wikimedia and Tidelift, where he is their general counsel today. He served on the Open Source Initiative’s and OpenETdata.org’s boards, and he helped draft quite a few open source licenses. Thanks, Luis, for joining.

Luis Villa:

Too many open source licenses. Other than that, yes, I’m, I’m happy to be here.

Stefano Maffulli:

Well, nobody’s gonna say it’s her fault. I just said it. Kit Walsh. Kit Walsh, thank you for joining. She’s a Senior Staff Attorney and Assistant Director of the Electronic Frontier Foundation. She leads the EFF’s AI and Algorithm Justice working group, and also specializes in copyright and free expression. Her recent work has touched on the rights of criminal defendants to challenge black box algorithms, very much on point with the topic today. And she’s also been working on the public’s rights when government agencies seek to adopt an algorithmic tool that will inform decisions about benefits such as housing and medical care. So, quite a lot of expertise in this topic. Thank you, Kit, for joining today.

Kit Walsh:

Yeah. Thank you for having me.

Stefano Maffulli:

Carlos Muñoz Ferrandis, he’s a lawyer and PhD researcher focused on the interaction between open source and standards from an IP and antitrust angle. He’s a tech and regulatory affairs counsel at Hugging Face, a startup I don’t know how to describe without using trademarks from other companies, but it’s a place where you can find lots of models and data sets to exchange. And he’s been driving the effort for responsible AI licensing. And Carlos is one of the drafters of the new set of licenses called OpenRAIL, Responsible AI Licenses. He’s also a member of the AI network of experts of the OECD, the Observatory on AI for the Organisation for Economic Co-operation and Development. And he is focusing on regulation and regulatory experimentation. Thanks, Carlos.

Carlos Muñoz Ferrandis:

Thanks a lot.

Stefano Maffulli:

And finally Kat Walsh, General Counsel of Creative Commons and co-author of version four of the Creative Commons licenses. She’s also working on public policy to enable better sharing and a vibrant commons. Kat has been involved with free and Open Source software since the mid two thousands, including stints on the boards of Wikimedia and the Free Software Foundation and as counsel at tech startups. Kat, thank you. Thank you for joining.

Stefano Maffulli:

So there are three main points that I’d like to cover today with you. One is understanding the difference between AI and other tools that we’ve known before. And the other is, how can society as a whole maintain control of AI systems and balance the powers of states and large corporations? And then trying to understand the good outcomes for this collaboration on AI, and how do we get to better AI systems, faster. So let’s start on the first topic, which is, so we hear often that AI poses unique challenges, something different from technologies that we’ve seen before, and yet we have seen technologies that have dangerous uses and are dangerous if put in the hands of regular people, from firearms to gene editing, all of these things, nuclear weapons, you know, nuclear power. What’s your take? What’s the difference between AI and what we’ve seen before?

Stefano Maffulli:

Luis, do you want to start?

Luis Villa:

I mean, I can go on, you know, I can just fill the whole 90 minutes just answering that question. But I think there’s a couple things, right? One, law has been dealing with new technology for ages. There’s an entire literature on how steam trains changed law in the United States. There’s a whole nother literature on cars. Every US law student, as one of the first cases they read in law school, reads a case about the wheel of a car breaking and how that changed US law. So in some sense, none of this is new, but in another sense, I think the velocity that we’re seeing here is new, and I think that’s one, like, key distinguishing factor. And one that I’d love to hear Kit in particular talk a little bit about, I think, is the mythologizing of artificial intelligence, right? It’s been in our movies, it’s been in our science fiction long before it was part of our world. And I think that really impacts how the average person, the average regulator, thinks about it. And I think that mythology really complicates things in a way that hasn’t necessarily been the case with past technology, right? Like, nobody read books about the web before we invented CDA 230, right? And I think that impacts things a lot.

Stefano Maffulli:

Speed is definitely an issue. So, Kit?

Kit Walsh:

Yeah, I can, I can pick up from there. And then I have a couple more thoughts about ways that it’s different as a practical matter for this kind of group as we, as we think about it. I think in particular, one of the myths that I encounter, not just for AI, but any algorithmic decision making tool, is the idea that the machine is neutral and wise. And so as long as you are punting your decision about someone’s rights or what to do to a machine, it’s going to be fair. And, you know, we, we all cringe when we hear that, but it does require a lot of sort of educating lawmakers about how even just an algorithmic decision making tool even before you get to machine learning how those things, you know, embody biases, not just of the programmers, but also of the, the data and assumptions that go into them.

Kit Walsh:

Which leads me to something else that’s a little bit different about AI, which is the way that its development is so dependent on massive data collection, and it raises a lot of new privacy related concerns, especially in terms of data that reflects private information about individuals, or information like, you know, pictures of my face that I put on Flickr in the aughts and never once thought would be used to train a facial recognition system that police would use to imprison people. And then sort of another element that I think is pretty different is the explainability piece and sort of how many different artifacts of machine learning development you need in order to have a shot at being able to explain how it’s arriving at those conclusions. And that especially, I think, is going to be relevant, you know, if the government tries to use a machine learning system in a way that does impact people’s rights. Part of your due process right, as someone subject to those, is that there be a rational explanation for why the government has made its decision about you.

Kit Walsh:

And you know, it may simply be that there are categories of tools that the government cannot lawfully use to make certain categories of decisions. And then there’s another category where there’s going to have to be a transparency and a process both before adoption and then on an individual basis as people are affected by it. And I think that some of those are, some of those are new for AI, some of them are just new for algorithmic decision making about people’s rights. But there are, there are definitely new things to think about. And then lastly copyright is different when you’re talking about you know, open source free software versus machine learning. This series has explored in, in detail sort of the different, the different aspects, the data, the model, the, the tool, the things that might be open licensed, the things that might not have any copyright in them that you can attach a license to. So you wind up, you know, in a toothless situation where a license might not be the right approach. And I think it is different for all those reasons. I’m curious to hear what Kat and Carlos have to say as well.

Carlos Muñoz Ferrandis:

Yeah, I think just to keep following the line of Kit, just to add one more insight. I think it’s also about the difficulty or challenge of predicting the output of the ML system or the ML model. You never know, to be honest. When you are a commercial entity investing a lot of millions in training and retraining and fine tuning your machine learning commercial application, at the end of the day, you are going to sell this very specific service on a closed basis. So impliedly, you are already restricting open use to the public or access to this tool because you know that the 99% average probability for the tool to score right is just for your specific case. So as a general rule, if I want to, let’s say, open source or just broadly open an ML model or an ML system, an ML app, it is hard as such to predict the output, right? So I really don’t know how the users are going to use this specific tool that I’m going to release.

Stefano Maffulli:

Yeah, no, that’s definitely something. So something different, you know, something new, these non-deterministic outcomes. What’s your take on the differences?

Kat Walsh:

I wanted to jump onto Kit’s explainability point, and I think that’s a big difference. Like with other technologies such as nuclear weapons or handguns or automobiles, a lot of what happens with them and what can be predicted to happen with them can be explained by physics. You know, you have these inputs, you get these outputs. Like, it’s easy to understand what they might do and how you might regulate them. And with AI systems, it is much more opaque. Like, you get a certain set of inputs, how does that trace to the outputs? And I think a lot of people are using that in order to distance themselves from the outputs, to say, like, oh, this is something else making the decision. There is no person in control of it.

Kat Walsh:

There’s nowhere to place like liability or responsibility. And that’s you know, that’s one of those things. Another I think is just how accessible it is. Like there, you know, there’s a limited capability to develop these systems, but access is being opened up to a lot more people who are using them for things that were not necessarily thought of or may use them for different ends than they were intended to. And it’s, it’s much harder to replicate like a train or a nuclear weapon than it is to like, give people access to a piece of software. And you know, that allows for a lot of great positive possibilities, but also like all, all of the other possibilities that people can think of.

Stefano Maffulli:

Yeah, that’s definitely something that we’ve seen, right? It’s already emerging very clearly, the difficulty in regulating the space, although there is clearly a need for it, or there is a push for regulation. Like, within a few weeks we have seen another recommendation coming from Europe about the AI Act to proceed on its path through the legislation. But also, was it last week that the bill of rights from the US government was also published? You know, what is that? Are we ready to have regulation on this? What should we tell regulators at this stage? Slow down and wait? Go ahead, Carlos.

Carlos Muñoz Ferrandis:

Yeah. Thanks a lot. I think just to complete also Kat’s comment and keep going with the line of explainability. I think it’s super interesting because then it’s about thinking, what do we understand as access or as the concept of openness? Is it openness to the result of AI development, the main gem or the main core part, so basically having access to the product, to the ML model? Or does openness, taking a more bi-dimensional perspective, also have to focus on the process of developing or training these machine learning models? Because if you facilitate access to how you build, how you train a machine learning model, how you explore, how you choose the data, et cetera, et cetera, then you are giving society more or less the know-how on not just how to build machine learning related models, but also how to govern them, right? This is the approach we took at BigScience. It’s not just about openly releasing another tool, and that’s it. It’s also about giving the entire world the possibility of contributing on a collaborative basis to the development of these tools, to learn how you develop these tools. And this is super, super important. I just wanted to make this point.

Luis Villa:

I think that connects to the regulatory point, right? I mean, there are many different ways in which you can regulate things. And certainly one thing that the EU has been focused on, and as a result, research has been directed towards, I think, is explainability, right? I saw an amazing demo yesterday. We’ve all seen the, you know, feed a string of text in and something comes out. Well, there was a demo just last week or maybe even early this week, time is funny for all of us, where you could feed it something like “a bald, angry man doing research” as a, like, stock demo. And it would highlight what the AI thought was bald, what the AI thought represented angry, what the AI thought represented research, right? And that kind of tooling helps us understand. And of course, in the demo they chose, the AI is very clear, but there were of course other demos and things you could put in there where the AI was clearly very wrong about what it was, you know, what it was guessing, what it thought was represented.

Luis Villa:

And that kind of debugging tool, you know, Carlos, like you say, you can do both. I think there’s a lot of value not just in regulation, but in government saying, hey, this stuff needs to be comprehensible, you know, as a first phase of regulation. And even though that’s not formally, well, it’s certainly not globally formal yet, it’s clear that there’s gonna be a government expectation of that kinda comprehensibility. And so as a result, people are doing research in it. So, you know, you can do that at least as a preliminary step, even before you get to “we should ban certain kinds of things,” though I think those are coming. And certainly that’s Kit’s specialization; you know, Kit already mentioned things like algorithms that can put you in prison. And as I think Carlos knows, and Carlos has written in one of the licenses, there are beginning to be some well understood areas where we, you know, believe that these are too sensitive to trust to algorithms. Now whether or not humans do a good job with them, I think, is also a worthwhile comparison and discussion point. But, you know, I mean, Carlos I know has opinions on which are these things that regulation is coming for first. Kit, I’d be curious, you know, I haven’t had a chance to do a deep dive on the White House stuff. I’m curious, does that have a similar list of protected high risk areas?

Kit Walsh:

So at the level that I’ve looked at the White House Bill of Rights, it’s more general, right? So it’s principles, which is a fine approach to take at this stage in the development of the technology. But I think to your point, there absolutely are areas where we know that using AI in this arena is just harmful or it can’t be reliable. So one example is facial recognition: its use by police is just harmful, not in favor of that. And then another is predictive policing and use of algorithms to decide whether someone’s going to be held pretrial or not. And that second category is different because there’s simply no way to make those decisions based on empirical data. All of the data that exists in those arenas is about police interactions with people, not about any underlying truth about danger to others or crime, for instance. So you simply don’t have data, but you have information that is portrayed as data and is used as data, actually, in commercial products. And that’s just not fit for purpose. And so that’s another sort of area where we know, you know, this data doesn’t exist. There’s no magic technology that’s going to turn it into, like, a wise, bias-free decision making tool when it’s trained on that kind of data, but that’s what’s being sold, right? So that’s very real. Yeah. I’m curious –

Stefano Maffulli:

One of the scary parts to me is these live deployments of large systems that lack the basic tools that we would expect in any engineered machine, like the explainability, the fact that we can control their output or predict their output. Is it a matter of time before we can have that explainability, or the better engineering, the better understanding, so that we can regulate them better, you know, a matter of time for the research to make progress? Or are we really, you know, doomed, and we should be looking at this from a completely different perspective?

Kat Walsh:

So I’m always skeptical of regulation that deals directly with the, like, actual mechanical functioning of a thing rather than the outcomes of a thing. Cause, like, when you try and regulate how something works, first, you’re restricting a bunch of uses that, like, may be good, but you’re also, like, just letting people play games with, like, working around it. I think the regulation, like, needs to be directed to what gets done with it, particularly to Kit’s point about the outcomes that are biased because the data going into it was biased. Like, we can see those. There’s a lot of regulation that deals with things that we can’t know, such as what’s in a person’s head when they’re doing something. Like, that’s not something that’s new. Even if the technology obscures some of it or makes it seem like it may be possible to be objective, yeah, I think it needs to be directed toward not letting people distance themselves from decisions that AI has made. AI is a tool to make decisions, and we need to evaluate that as critically as we do any other tool. We can’t just, like, say that the AI decision was somehow better because it was somehow neutral or objective; it is just taking what gets put into it.

Luis Villa:

I want to say two things there. Carlos made a point to me yesterday. Carlos and I were discussing licenses offline, which are of course the traditional tool, the preeminent tool of the open source legal community. And Carlos reminded me that things like model cards, which for those who aren’t familiar is sort of a standard way of presenting information about a model, are in some ways just as important, probably much more so in this space, because they allow for nuanced decision making in a way that the license can’t; the license can only set guardrails. It can’t provide the sort of informational qualities, right? So that helps, Kat, to your point about using this as a tool. Part of how we use this as a tool is the things we do to help society understand the strengths and limitations, and model cards are one tool in that toolkit.

Luis Villa:

But I do wanna push back on something you said about measuring outcomes, because there are a lot of processes that we use in society that can’t be analyzed purely by outcome, right? You know, I was called to jury duty a little while ago. It was unclear if I was gonna, that jury is probably still sitting right now in trial. And it was two weeks to select a jury, and much of that time selecting the jury was spent exactly because we can’t second guess the outcome. You know, so we spent a lot of time talking, essentially trying to provide model cards for each juror, right? What are this juror’s biases? What does this juror think about police, drugs, race, et cetera, right? And you know, so I don’t think, I mean, I do wanna be outcomes focused because, at the end of the day, that’s what matters. But there’s only so much you can do there on some dimensions. That’s gotten a lot of people wanting to – David’s typing in chat. Carlos has raised his hand, so go for it.

Carlos Muñoz Ferrandis:

Yeah, no, I think I want to refocus the debate, taking this context and relating it to open source and what’s the role of open source or even OSI nowadays in the AI space. It is not just about the outputs, so what we are going to share, like models, data sets, et cetera; it’s also about the outputs or practical consequences of AI central regulations. For instance, you take the example of the AI Act: one of the core pillars of the act for the implementation of the regulation is standards, right? Technical specifications documenting and describing the way in which an official or formal SDO, such as Europe’s CEN/CENELEC or ETSI, describes the degree of trustworthiness of an AI system, right? So we are going to end up, two years from here, with a 200 or 250 page document explaining to the public or to the market how your product or your AI system has to be considered trustworthy, right?

Carlos Muñoz Ferrandis:

And you have to interpret this technical specification because we are not dealing with a technical standard in the telecom space, so with a 5G interoperability protocol. We are dealing with a standard which is going to be also subject to the interpretation of the market. Now, how do you comply with this standard? You can build tools, software tooling, just to certify that your project or your ML app or whatever you’re commercializing is trustworthy or has an AI trust level, right? If you take the example of Singapore, Singapore Council Authority developed last year, and tested along with some large tech companies, a product explaining basically and dealing with a label for trustworthy AI or explainability, right? Now, who owns this tool? Because this is going to be a big business, for instance, for consultancy firms, or even some large tech companies are also interested in developing these tools. Because at the end of the day, it’s not about the standard, it’s about the software reference implementation. Now, who is going to hold this de facto standard in the market? Or are we interested publicly in promoting the open source release of this tooling for everyone to enjoy and be able to certify compliance with specific AI central regulations? Now there, I really see a role for open source and open source related reference implementations.

Stefano Maffulli:

Yeah, that’s probably a space that we know a little bit better, put it that way. And it’s less challenging to think of it that way, but I’m still wondering and still trying to understand, after reading the AI Act, whether we can ever get to that place where we can explain why a car, an automatic driver, has decided to take a route or, you know, to hit a pole to save the kitten, or, you know, the trolley paradox and all those other things which are part of the AI Act; they’re described as examples. That’s one of the reasons why I’m thinking, do we need to wait and see what the research community is capable of offering, you know, what the outputs are, or do we need to move on? I like what Kat was saying about the fact that we need to think about the outputs. If I understand correctly, you are thinking about setting the general rules, general policies, general intentions, and then kind of abstracting a little bit from the actual implementations, a little bit like, you know, the open source movement wrote the manifesto well before the first licenses appeared. I see Luis nervous –

Luis Villa:

Oh, yeah. Well, I mean, I think we’re in an interesting time, right? If you had asked me about this topic a year ago, specifically around open, I would’ve said, look, it costs so much to train, there is really no open in this space, right? We are in a window right now of, you know, huge creative ferment in open, loosely defined, right, in ML. And as soon as somebody says 250 page regulation I go, okay, well, that’s sort of the end of open, right? And that might not be a bad thing, right? It might be that traditional open development mechanisms cannot actually create AI that is trustworthy enough to run in the wild, right? And I think this gets to what all of us have been hinting at in some form or another, which is liability, right?

Luis Villa:

I mean, you know there is a model of regulation that’s just that instead of specifying is this trustworthy, we just place the burden of proof that if you get sued because you blow something up, the burden of proof is on you. And the AI, the Software Liability Act and, and AI liability that’s coming outta the EU right now is an attempt to do that. You know, that might be the right thing at this level, and it might be the 250 page guidelines are premature, but to be clear, we gotta do something. And I think certainly the question for Open, which I think is, you know, what Carlos is trying to get at is how does Open interact with all of this? Right? And I don’t think we know yet. I’m not sure. I hope we get a chance to find out. I’m not sure about that.

Kat Walsh:

I wanna jump in with a little bit of the perspective that I’m thinking of. Like, I’m thinking, particularly since this is an OSI-hosted event, about regulations hampering open development by creating standards that are only able to be met by, like, some of the largest companies and stopping other development from happening. Like, I can see those processes get captured all the time and create standards that only, like, the largest corporations can meet. And that’s the thing that I don’t want to happen, even as I think the social outcomes are things that need to be controlled.

Stefano Maffulli:

Yeah, no, for sure. It’s one of the challenges, but we’ve seen this before. I mean software when it started to be developed, it was relegated to small departments in universities that could afford buying big equipment. And it took a while before you reached the level where anyone with a hundred bucks computer can write really impactful software. So if I go back to my time thing – I’m starting to think that is probably something there.

Luis Villa:

Well, but I think it’s an interesting thing to note there that we’ve essentially been operating in a world where software is not regulated, right? And I think one of the things that is very interesting about machine learning is that it is going to cause that period of time to come to a close, right? Like, I think that was a time that we’re gonna look back on and think that we were all sort of naïve. And the question is what form that regulation takes, right? And I think, you know, Kit, I would love to hear your take on this, right? Because in many ways EFF has been a bastion, and I say this as a longtime card carrying member, but of late not a card carrying member, in part because of some ideological differences, where EFF for a long time was the free speech wing of the free speech movement, right? Like, that we could not do regulation of software for many reasons. And so I don’t think that was wrong at the time, but I do wonder if that’s changing. And I’m curious if you can share where EFF’s head is at on that changing balance of things.

Kit Walsh:

Yeah. I’m curious about the idea that software hasn’t been regulated for the last long time, cuz we deal all the time with people who aren’t able to publish their software because of export controls or DMCA 1201 issues, or, you know, any variety as well as confronting a lot of proposals that sort of would particularly when a new technology comes along, sort of regulate too early in the sense that there are some harms. And so shut off the potential for good software to be created as well. I think, you know, we continue to think that software is subject to First Amendment protection, which isn’t an absolute, but it does require that you balance the need for the regulation and look for less restrictive means of doing the regulation before you take the approach of banning the dissemination of information such as software code.

Kit Walsh:

So it’s never, it’s never an absolute bar. And in particular you know, in the area of machine learning, there are privacy concerns that are an equally fundamental right of people which countervail some, you know, potential dissemination of information. So I wouldn’t say that it’s an absolute, I wouldn’t say that First Amendment is a ban on regulating artificial intelligence tools. But I do think whenever you’re regulating the dissemination of information or tools for creating new artistic expression, et cetera, et cetera, that First Amendment scrutiny is warranted. But again, that’s a legal test that involves some balancing. It’s not an absolute ban on, you know, the government having anything to do in this space.

Stefano Maffulli:

So I feel like we’re going around regulation a lot, but is there something else that we can do? Like, when software started to come out of the research labs, copyright was not consciously applied to it, it was just flowing, right? It was a new artifact being created. It was a specific policy decision by, if I remember correctly, IBM, that decided to apply copyright, and then it was tested in court in the eighties in the United States as a theory. And then copyleft emerged, like, as a hack on copyright. So we have new artifacts, new technology, new tools, something new, and there is regulatory pressure. But if we put that aside for a moment, as a community of practitioners, researchers, society as a whole, do we have an opportunity here to create some new way of controlling and balancing the powers of creators of AI?

Carlos Muñoz Ferrandis:

So I think, if I may jump in, also linking to all the previous conversation, one thing that we have to realize now, just today, this very moment, is that we are in a very precious moment in time because we are observing the battle for openness, right? For a few years now, we have seen all these big large language models pushed to the public by large tech companies and now by research communities or more startup related companies. So we are now where maybe software was, what, in the nineties, two thousands with Linux. So we are now in this very moment where we are fighting this policy battle of open versus closed: openness equals democratizing access to machine learning, versus closed, where closed machine learning is safe machine learning, right? This is the main debate that we are having right now.

Carlos Muñoz Ferrandis:

Regulation won’t come right now. Regulation will come in three years’ time, right? So what are we going to do literally today or tomorrow to play at this intersection between enabling open sharing or open access to machine learning and at the same time promoting responsible use of the technology? As you all may know, we’ve been trying things already for some months to push for a new type of responsible AI licenses, so open and responsible licenses. These are just the first steps. We are in the moment where the Open Source Initiative or Free Software Foundation, or even their proponents, were back in the eighties or nineties; we are there, just starting an entire, maybe, movement. And of course, we are asking ourselves the same questions as OSI, FSF, Creative Commons, right?

Carlos Muñoz Ferrandis:

What do we do right now if we don’t have regulation, but at the same time we want to keep promoting open access and responsible use? Are responsible AI licenses the main tool, the silver bullet? No, of course not. These are just another proposal by the AI community, which should be improved by taking a collaborative approach. This is also why I’m here today and Danish will be here next week. So this is one potential outcome stemming from the AI community. Maybe the OSI will have another one, and we will of course gladly back it up. Or maybe Luis, Kat and Kit have other different opinions on this.

Luis Villa:

Well, I just wanna say that I think an important thing that we haven’t touched on is that, for a variety of reasons, I do think there are a lot of interesting parallels with sort of early free and open source software. But I think that one thing that is critically different, and I think very interesting, is that the AI community, the community of practitioners, is extremely concerned with ethical questions as a general matter. That will change, of course, as the community gets bigger. But right now, there’s some polling data on this I saw the other day, practitioners are extremely keyed in on these ethical questions. And I do not think that was the case in the nineties in the software movement. There was certainly some, I don’t wanna say there was none, right? Because again, that was when I joined EFF as a member, right? Like, so these things were very real. I had the t-shirt that could decrypt a DVD key. So it’s not that we weren’t thinking about these things at all, but they were not quite as much in the forefront as they are in the AI community. And I think that’s a really interesting difference. I don’t have any good conclusions to draw from that, but I think it’s a really fascinating difference.

Carlos Muñoz Ferrandis:

So I can give you, if I may, sorry to interrupt you Stefano, a very practical example. And it’s not just about promoting RAILs, it’s about our experience at BigScience, right? So this concept of an open and responsible AI license is not just another hype or cool new legal tool that we came up with or developed without a practical or empirical approach to it. This was basically a response to a community need. In this case, it was the BigScience community’s need, according to our ethical charter, reflecting our concerns about acknowledging the technical capabilities, but also the technical limitations, of the model, which are also documented in the model card. So the license was the response to these concerns. Of course, we can tag it as an ethical license, but to be honest, I don’t have a definition of what an ethical license is. What I do know is that the OpenRAIL was basically the response to this issue. So we took an organic approach. We had to fill the gap because there wasn’t, either from OSI or Creative Commons or other organizations, any license covering this gap. And it’s fine. We just responded. Right?

Stefano Maffulli:

Right. I, I mean, I think Kat can. Yes. I, I see your mic coming up and you have served on the FSF board, you know very well the, the meaning of, of freedom zero and how it came to be, why that exists.

Kat Walsh:

Yeah. And I think I have also developed more of a skepticism about copyright licenses as a way to control social behavior than I had when I started out and thought that that might work. Like, a lot of us started out thinking that that might be true, and that copyright licenses could build a set of norms. And now I think that copyright licenses are great for setting the bounds of economic rights, where creators won’t participate in the system unless certain rights are protected and they have certain remedies in copyright if those aren’t respected. But for determining norms? A lot of the people who are thinking about the norms have never even read the licenses. Like, let us be clear about that. They don’t particularly care about the licenses.

Kat Walsh:

They know more of like a, just the set of practices, what people are actually doing. So, you know, CC does not, well, depending on what you consider an ethical license, like CC does not have ethical licenses. But you know, you might argue about non-commercial but we do like advocate for many ways of sharing that are not tied to the license. Like, here is the license that determines what you may do without incurring liability for copyright infringement. But here’s what we think that you should do to be a good participant in this space. We say that particularly around the use of like, a lot of things that we don’t think are copyrightable, that we encourage people to use CC zero on, for example, like a lot of data sets you know, public domain, do whatever you want under copyright.

Kat Walsh:

But like, hey, if you’re participating in these academic communities, you should say where you got the data from, because that is important for reasons other than copyright. You might have access controls on data that is particularly private or sensitive, like, that is, you know, uncopyrightable, but you still might limit access because it is personal data. And that’s not antithetical to what CC does, but we do not put that in a license. So I think it is good to have sets of norms and practices and behaviors, and even have them encoded. I’m just skeptical of a license that has remedies in copyright as the way to do that.

Kit Walsh:

Yeah, I tend to think of the license as providing the area of certainty where you as a user know for sure that you are not going to have liability to the person who has granted you the license as long as you stay within its contours. And that’s really helpful for follow on innovation, right? That’s really helpful in the free software, open source communities for people to have confidence that yes, I’m allowed to build on this to make something new and cool, and then I’m gonna share it back potentially depending on the license, et cetera. I think it’s all, I think it is useful at communicating norms or at least the wishes of the person who developed the thing. But where it’s not so useful is it doesn’t prevent people from doing things that are fair uses, even if it’s contrary to the license, right?

Kit Walsh:

So, you know, when I create things, people are like, why don’t you put a license on there that says that like, fascists aren’t allowed to use it. I’m like, well, you know, it’s an expressive work, and they would be expressing a totally different viewpoint and like, it would probably be a fair use, right? Like for them to take my creative expression and to give a completely new message with it. And I think it’s similar, you know, you don’t need a license for anything that you’re going to do that doesn’t infringe the copyright. And, you know, we’re getting to some of the limitations of copyright as a governance tool. I think it’s absolutely worth doing. I think it’s great. Like, I think it does matter whether you have the certainty that a license gives you or if you don’t, if you’re operating in the gray area particularly, you know, fair use is a US legal regime that doesn’t extend to other jurisdictions.

Kit Walsh:

But I think it’s also, it has pretty significant limits, particularly if you think that which, I’m not saying anyone thinks this, but I have seen people think, you know, I’m releasing it under these terms and so it cannot be used in a contrary way. It could either be used lawfully in a way that doesn’t infringe the copyright, or maybe they don’t care about infringing copyright because they want to, you know, mess with democracy or do something that’s illegal for other reasons. And so you know it’s an important valuable tool that does have limits that are important to understand.

Stefano Maffulli:

It makes sense. Luis?

Luis Villa:

I was just gonna say, the number of times I’ve had to explain to people that license terms are most effective against large entities with legal departments, and not against individual bad actors, right? Like, that is a very long-running discussion. And relevant here, right? Because some of the kinds of things that we’re trying to regulate in the AI space are very much things that governments do, things that, you know, large commercial entities do. And for them, a license may be a very effective regulatory regime, right? Because their lawyers actually do read the things. But if you’re trying to say, you know, don’t use this image generator for porn, or you violate the license –

Luis Villa:

Yeah, yeah.

Stefano Maffulli:

No, I understand. Oh, Carlos, go ahead.

Carlos Muñoz Ferrandis:

Oh, thanks a lot. So yeah, I agree, definitely. I think first of all the interpretation of the restrictions, and second of all the enforcement of the restrictions, is of course such a challenge. I mean, the fact that we place a set of use restrictions in an open license doesn’t mean that we are going to ensure a hundred percent control of the downstream value chain. Of course not. Now, if we manage to dissuade or deter some of the users or potential misusers, or even achieve enforcement in specific cases, yes, we are winning against some very minimal percentage of misuses. Also, coming back to the regulation, I think it’s very interesting. I don’t know if you are following the parliamentary debates on the AI Act, but now they are heavily, heavily debating whether or not to integrate specific open source provisions within the AI Act.

Carlos Muñoz Ferrandis:

And this is very interesting because they are already playing with the concept of an exemption for open source models or pre-trained models. Now, of course, we are not going to get into how the definition of open source for the European Parliament is not the one of the OSI; that’s, I think, another discussion to have. But coming back to the Czech proposal of the 15th of July for the Act: it has new Articles 4a, 4b, and 4c, the Article 4 provisions dealing with general purpose AI systems, okay, and with how these general purpose AI systems also have to comply with the related provisions for high risk AI systems. Now the Article 4c exemption basically says that the general purpose AI system should not carry the burden of the high risk related provisions if the licensor, or the model developer, the stakeholder releasing the model, explicitly makes clear in the documentation or in the specifications of use of the model that the model or the general purpose system cannot be used for any of the high risk scenarios.

Carlos Muñoz Ferrandis:

So you are already placing in the market, or on the stakeholder, an economic incentive to integrate in the license or in the terms of use by means of which you commercialize the model some set of use restrictions mimicking the AI Act. Right? Now, if I have to do this, do I choose an open license or a license for a model? No, period. Am I going to choose a RAIL or another, different in-house drafted license? Maybe, because I will have more chances of success in terms of publishing or releasing my general purpose AI system without having to comply with the high risk related provisions, which potentially mean between tens of thousands and hundreds of thousands in compliance costs, right? So it’s very interesting also to think about how the regulator is playing, well, sorry to use the term playing, but maybe how they are conceiving potential economic incentives for the market to keep releasing AI, but at the same time be compliant with these high risk scenarios.

Stefano Maffulli:

This is extremely interesting because I think it brings me back to the thoughts that I was having about freedom zero. So freedom zero is the freedom to run the program for any purpose, which then informs, down the line, a bunch of the pieces of the Open Source Definition too. That was a practical choice. It was a conscious choice by the early writers, the participants in the free software community. They knew that they were writing software that could be used in a weapon, but consciously decided that regulating or, you know, setting social norms and restrictions would be hampering, it would be damaging the evolution of the field, the evolution of the science, computer science.

Stefano Maffulli:

It would be slowing it down, and with very little possible advantage. So what I’m hearing from you, Carlos, is that the European Union is thinking of giving some permissions to open source and crafting some way of saying, you know, this regulation will be leaving some open possibilities for open source not following the regulation fully, or some exceptions. And at the same time, I heard you saying that the research community is really pushing to have these restrictions in place when they release their models because they’re aware of the risks and stuff. Do you think that they also have a full understanding that, by putting restrictions on, by putting barriers and friction on the free flow of models and knowledge, on the way that they can mix and match the research that they’re doing, they may also be slowing down? Do you think that they have evaluated this possibility of putting themselves in the position of slowing down the progress of the science because of this fear of misuse of what they’re producing?

Carlos Muñoz Ferrandis:

So I think, back again to my point, when you place exemptions in a regulation, and specifically in a sectoral industrial regulation like the AI one, so for a really cutting-edge technology, you also have to think as a regulator that you are generating some specific economic incentive in the market, because you want the stakeholder to comply with the regulation in this very same way, right? So when I’m open sourcing a system that is not a high risk system, or if it is a high risk system and I’m openly releasing it with a set of use restrictions, I know I have more chances to comply with the regulation and keep commercializing my system in whatever way. That’s one point. Now the other point is that the compliance structure provided in the AI Act for high risk AI systems is of course going to be, let’s say, made possible or afforded by large companies.

Carlos Muñoz Ferrandis:

Yeah, it’s huge. Because at the end of the day, what you are going to do with a high risk system, when this high risk system is certified and can legally be commercialized, is provide a nice return on investment for the company that invested in the compliance cost. It’s like a patent. I invested what, 50K in a nice patent or 15K in a nice patent, and now I want the return on investment, a direct return from it by, I dunno, extracting royalties, et cetera, et cetera. With this high risk certification program, the same might happen.

Stefano Maffulli:

Yeah. Yeah.

Luis Villa:

I’d love to hear from Kit on this because I think, you know, I was presenting the optimistic view of, hey, practitioners are, you know, really thoughtful about this kind of stuff. And you know, Kit, you’re dealing at least some with the cutting edge of, let’s just say, the less scrupulous edge of those practitioners, who are directly operating on some of these commercial interests, right? I mean, they’re like, well, if I can sell to the court systems, I will lie through my teeth about how neutral these sentencing tools are, or whatever the case may be, right? Like

Kit Walsh:

Yeah, they’re actually investing in legal tools to prevent transparency, right? So this isn’t a machine learning tool, it’s an algorithmic tool. But for instance, in a lot of the cases where we’ve been fighting for a criminal defendant’s right to inspect the code that’s used to generate evidence against them, right? Like, so basically I’ve swabbed the murder weapon and it’s a mixture of DNA, so I can’t just map it to one person’s genes. Instead I get a mishmash of genes that might have been contributed in any combination from, you know, any number of different people. And the vendor of the software says, I did some very smart math and you are very likely to have been one of the people who touched the gun. And then you say, well, I’d like to see your code implementing that math, and they say, no, that’s a trade secret, and it would harm my commercial interest in being able to sell this technology to police forensic labs if you or anyone were allowed to look at it.

Kit Walsh:

And they actually persuaded some courts that that made sense early on, right? And so last year was the year that we really turned the tide on that in the US, both at the federal and state level, and have convinced courts that actually, you know, the right to confront the evidence against you involves the right to inspect software that’s used to generate evidence against you to put you in jail or sentence you to execution. And so this is an issue that you could address with new regulation, but there are also existing rights and regimes that get at the issue as well. So a lot of our work has been, you know, I’m a litigator, right?

Kit Walsh:

I go to court and I try to win cases that are going to, you know, protect people’s rights. So, you know, a court can’t pass a new law. I have to tie it into a right that somebody already has. And of course we also work on legislative and new regulatory approaches. But, you know, it has been a matter of doing some legal innovation in order to make sure that secrecy is not used to deprive people of, you know, insights about how these systems work and how they affect people. Cuz it’s not just a criminal defendant who’s impacted by that. It’s everybody. It’s the whole public who, you know, this is the criminal punishment system that in principle is doing justice in the name of the public.

Kit Walsh:

You know, everybody has an interest in that being done in a way that’s fair and also in having transparency into how it’s done. So I do think, there was a discussion in the chat earlier about model cards, you know, that’s a tool that, if you are a technologist and you want to be a good participant in this space and you wanna help people understand when the tool is going to be reliable and when it’s not, that’s a thing that you can do. And obviously it’s not the only thing that people should do, and, you know, mandating model cards would not be an adequate regime to govern behavior in this space, because there definitely are actors who are going to act on their commercial interests, and it’s in their commercial interest to prevent criticism and scrutiny of how the tool works unless we do something, you know, to change that commercial interest, right?

Kit Walsh:

So, you know, for instance, regulation that means that, you know, you can’t sell your product to you know, a government user or you know, the kinds of users that you wanna reach if that’s how it works, or it can’t be used for a particular purpose, right? Like, you know, we haven’t even talked about are we, are we governing use yet? Are we governing like creation of the technology yet? Or you know, what’s the governance on what kind of inputs you can put into technology? Cuz those are all, you know, different and valid points of, of legal intervention.

Stefano Maffulli:

Indeed. So, so how do we keep the balance with, or how do we gain the balance between society and state actors and, and large corporations that are deploying and developing these large systems? What, what can we do? Ooh, the silence <laugh>.

Luis Villa:

I will say I don’t have a good answer to that. I will say that I think there’s one thing that I find a little overstated, and this I think gets to Kit’s point, or Kit’s experience rather. We keep saying, oh, ML is so complicated, it’s so opaque. All these systems that we use are complicated and opaque, right? I don’t know that ML actually makes that materially all that worse, right? Like, you talk to advertising people, there’s not a lot of ML in that, but nobody understands why the ads you get served get served. That’s a deeply incomprehensible system. Like Google, you know, all these regulations of Google’s search results, Google doesn’t know how it’s, I mean, they can tweak it, they can put fingers on scales, right?

Luis Villa:

But they haven’t known exactly how they get their search results for a long time, you know, whether you call that machine learning or not. Oh, or my other favorite example, and this one actually has caused death: the Toyota braking stuff, where Toyota’s software for the brake system in their cars was terrible and a trade secret. So it was only available to certain, you know, expert witnesses in these Toyota unintended acceleration cases. And from the few facts that we were able to get out of those, those witnesses were all horrified at the quality of the code. But that didn’t require machine learning. That just required bad, opaque code. So in some sense, some of these problems are not new. And I think it’s probably good for us to remember that.

Kat Walsh:

Yeah, like repeating a little bit what I said in chat, but like what Kit was describing with systems being used in policing and criminal sentencing, and, you know, not even being available for inspection. Certainly being available for inspection wouldn’t solve all of the problems, but it should be a condition that, if it’s being used to restrict somebody’s rights, it be available. Like, that should just be the price of entry for that space. <Laugh>

Kit Walsh:

Well, in particular, you know, it’s not the end of the remedy that you got to read it, right? Like, you find flaws, it gets disqualified, so it can’t be used. Now that’s on the record, and so it’s not gonna be used in other cases where you’re not involved. And when someone wants to deploy another tool like that, you can point to this example as ammunition for why transparency before its adoption, as well as public participation, is going to be essential. So these cases are like, let’s get the first piece, let’s actually get eyes on it. And then you get instances like the technology used by the Office of the Chief Medical Examiner in New York City, where yes, there’s a hidden function that might be sending people to jail improperly. And now that we have that one example, it’s a heck of a lot easier to convince other courts that it’s important to look at it, and to build on that initial success.

Stefano Maffulli:

I mean, we’ve been advocating for open source software to be used in anything that is related to the interaction between digital and real life, so that at least we can have some level of control as a society, a collective level of control. Like the campaign from the FSFE, Public Money, Public Code, and other campaigns that, Kat, I interrupted you. Sorry.

Kat Walsh:

I was gonna say that sometimes people hear people from our community advocating for openness as part of that solution and make the mistake of thinking that we’re saying that it will solve it. Like, of course it won’t, but I think it is a necessary part of the solution.

Stefano Maffulli:

So one of the important pieces that you mentioned at the beginning, when you were talking about the difference between classic software and AI machine learning systems, is the data, and access to data seems to be one of the bottlenecks for wider participation from the hacker community or smaller groups. So I’ve received a question from the Hippo AI Foundation, and the Hippo AI Foundation is a group that is building a commons of datasets for medical research, a huge, vast topic that has a huge amount of implications for privacy and other things. But they have an interesting question. They’re building a commons of datasets for AI, and they wanna know if you can imagine a way, some sort of copyleft, where if you use this data set that is contributed by society, the end results are also shared and available under the same conditions.

Stefano Maffulli:

Sort of, again, copyleft, a hack on this. What do you think in general about the topic of data availability as a tool to democratize and spread the adoption of machine learning AI systems first, and then the availability downstream of trained models to society as a whole?

Carlos Muñoz Ferrandis:

So, yeah, if I imagine being <laugh>, it’s a super interesting topic. I think under the RAIL initiative, one of the next steps we were considering is to go for a data license. But a data license is even far more difficult to craft than a model license, right? A good data set is the main root of thousands and thousands of models. It’s not just licensing a model, it’s licensing the root of it, right? To license The Pile, to license LAION, right? This kind of core data set. When you are conceiving or designing the licensing strategy for a data set, you have to take into account a lot of purposes, and maybe, leaving aside use-based restrictions, also what the main purpose is within the value chain for the data set to accomplish. Is it training?

Carlos Muñoz Ferrandis:

Is it validation or testing, for which specific ML model, under which context? So there are a lot. And do we have a transparent tracking or history of the data set? So all these considerations have to be taken into account when drafting a data license. And I’m not even talking about an open data license. We have some open data licenses or attempts out there. The Linux Foundation has, there are three data licenses. The folks from the Montreal Data License initiative issued a super nice paper some years ago. I can share it in the chat. So we have some first attempts, but it’s such a challenging

Stefano Maffulli:

Yeah. Data aren’t static either, which is another thing to keep in mind

Luis Villa:

They aren’t either. I mean, this is a problem that OpenStreetMap runs into a fair bit. And I think it’s something that is gonna be a real challenge for, how to put it, the open community is a global community. The ML community is a global community. The legal systems and the regulatory regimes that we operate under are not global. We sort of took for granted, I think, in the open source community that the Berne Convention, for those of you non-lawyers watching, the Berne Convention is the copyright treaty binding, what is it, everybody except Panama or something, you know, binding every country on earth to the same basic platform. When I’m speaking to programmers, I talk about Berne as a platform that other copyright law builds on, and it gives us a global set of principles and ideas that we can use to build, for example, a global open source software license that more or less works, certainly with a lot of edge cases, but more or less works across all legal regimes across the world. And, you know, data is licensed very differently in the US from the EU, or Japan or Mexico; all have different data licensing regimes. And that’s just speaking of the database rights, much less the privacy rights, the liability rules. It is not clear to me how we write rules for global communities that are based on non-global legal systems. I still don’t have great ideas there, and it troubles me.

Stefano Maffulli:

I think Kat is probably the person on the panel with the most experience on this <laugh>.

Kat Walsh:

I was just thinking that doing a global copyright license is hard enough, even with the platform of the Berne Convention. Data governance is not my area of expertise. You know, I know just enough about it to know that it is a dangerous field to wade into, and I usually need to consult, like, you know, many different counsel when trying to do something that’s cross-jurisdictional.

Stefano Maffulli:

But I remember the Creative Commons initiatives at the beginning, they were translating the licenses in different jurisdictions mm-hmm. <Affirmative>. And I think that approach has been changed, right?

Kat Walsh:

It has. Initially the licenses were based on a background of US law, basically. It was a small, it was like an experimental project. Version 1.0 of the licenses was kind of like, let’s see if this takes off. And they didn’t work in all jurisdictions. So the later versions, two and three, had international, like, ported versions that were intended to function the same under the background of different countries’ law. One of the big things that the most recent version did was try to see if those can all be addressed in one version. And we’ve, you know, done that as well as we can. We haven’t seen edge cases where that doesn’t work. But it doesn’t address some of these other questions. Like, CC takes the position that data is uncopyrightable, and our licenses are copyright licenses.

Kat Walsh:

So we don’t deal, well, we just sidestep a lot of these hard issues by saying, like, you need a separate agreement for that. And this separate agreement is, I think, what we need to think about with these data sets that have issues outside copyright, like privacy. And we do take a small position on database rights, saying that they are licensed under the same terms, basically. But that doesn’t address all of the issues where other regulations will come into play, other than just the, like, sui generis rights and things.

Kit Walsh:

Yeah. And I think, you know, one way you can do it is if you are the sole holder of that collection, right? You have put that collection together, so you can require people to come to you and enter into a contract with you that has the terms that you want to impose, right? It’s certainly not as elegant as, you know, an open license that gets to flow without a single repository, or, you know, you could have a few different people that you trust to, you know, give out those rights, right? But, you know, that’s one option. You can write whatever you want into the contract, right? Of course, neither a copyright license nor, you know, I, who have slurped data off the internet, giving you contractual rights, neither of those can address the privacy rights of individuals who may be reflected in that data, right?

Kit Walsh:

Like when I uploaded my picture to Flickr and put a CC BY license on it, I didn’t waive my privacy rights in terms of, you know, it being used for police facial recognition. That’s not part of a copyright license. So I think there are approaches. They’re not as, you know, fluid necessarily, and depending on the data, it might not, you know, be practical. And I also don’t know every country’s contract law. You know, you could run into issues there where, you know, these are terms that just aren’t enforceable. You can’t require that in a contract. But, you know, if I were trying to keep a data set used only under certain use provisions, that’s probably how I would do it.

Stefano Maffulli:

Do we need to invent something here? Is there a space, or did we uncover the need to get started, like, come up with ideas for solving this issue of creating data sets, assembling them, and distributing them safely according to conditions that the creators are willing to accept or promote?

Carlos Muñoz Ferrandis:

So if I may jump in, I think, and to agree with Kit’s point, by the way, I think it’s quite challenging. For both models and data, regardless of whether we think these are copyrightable or not, we also have to take a look at empirical evidence. So, facts happening right now in the market: for instance, at Hugging Face the top license used to release models is the Apache 2.0 license. And the third or fourth one is a CC BY license for datasets. I think it’s a CC license, right? Then you get out and you see the last release of OpenAI, Whisper, MIT license; the first one back in 2019 for OpenAI, GPT-2, MIT license, right? Copyright based. Facebook ones, copyright based. So there’s a tendency in the market. We can discuss whether copyright is the right tool or approach or not, but I think it’s not even about taking just this purely narrow regulatory approach to copyright and its economic role in the market.

Carlos Muñoz Ferrandis:

It’s about what open source and Creative Commons achieved with the licenses. Because we do not care anymore about copyright or not. A model developer doesn’t think about this. A model developer consumes a license as a carrier, as a social institution. That’s the main tool. That’s the main role of the license; whether you put in copyright or not, that’s fine. For a model developer, for a guy wanting to release a dataset, the main point is just to take a license, to pick a license, an open source license. Sometimes I’ve seen datasets released under a GPL-3 license. So this is what the market thinks, or model developers, or maybe just the technical context thinks about open source and credits. This is the reaching power you have, and it’s a lot.

Kit Walsh:

At least for a lot of uses in the US, like, a lot of those uses are considered to, like, I can use data under any license I want. It’s like a transformative use. It’s not all use that implicates rights under copyright. So how much power does the copyright license have? CC has faced this problem, because there’s a lot of criticism of, for example, all the photographs on Flickr being used to train these facial surveillance systems. Like, is there anything that, you know, CC could have done about that? We actually did some work with Amanda Levendowski over at Georgetown and, you know, all of her papers on AI and copyright and, you know, other rights implicated are wonderful, and people should read them. I know Luis cited one earlier. But, you know, a lot of their analysis was like, yeah, there was a very strong case for fair use. Like, even if CC’s license had put in a restriction, it probably would not have stopped a lot of these bad uses, at least in the US, which is, you know, where all my experience is based.

Stefano Maffulli:

Luis,

Luis Villa:

Well, the copyright license, I don’t think would’ve. I mean, the EU specifically, with good intent, put in a, you know, a copyright law provision that essentially makes training opt out, right? Now, privacy law, of course, still applies very, very strongly there, but the copyright license can only do so much there. I mean, you know, stepping to your question of is this something we should be thinking about? I think there are a lot of people who have given some thought to this problem, right? Like, there’s an interesting argument around data trusts. But the ease of use of, hey, I just put this one thing here <laugh>, you know, I put this one text file in this one directory, is both literal ease of use, in the sense of you’re not setting up an entity, and the cultural ease of use, right? You know, when you ask programmers, not lawyers, oh, how do I control how people use my thing? Well, give a copyright license. And that’s, you know, for better or worse, and I certainly take some of the blame for this myself, that is a norm that we encouraged in programmers for a very long time, and I think now we’re running into the limits of that.

Stefano Maffulli:

Right. Right. But there is an argument made about more data, I mean, more data means more accessibility to machine learning, and that’s one way of balancing the power of large corporations, like Levendowski’s argument. So since we’re coming up towards the end of the panel, I would like to close with one thought from all of you. What is the one thing that you imagine seeing migrated from the open source culture, not just open source, but open data, open knowledge, all the opens that you can imagine, that can be transferred to AI in order to achieve an, okay, utopian future where these AI systems are deployed for the good of society? What would that be?

Luis Villa:

I’m actually gonna push back a little bit on that by saying simply, I feel like predictions in this area are almost sort of pointless. It feels like the scope of the change that is coming is so vast and radical that it’s going to be very hard to do anything other than sort of put one foot in front of the other, kind of, you know, build with, like I said, the good intent that is there and do that as quickly as we can, which sounds sort of pessimistic, but is also very, I don’t know. I often analogize this to the printing press, and the good news about the printing press is that, you know, it was incredibly good for society. Also, it caused a hundred years of bloody warfare. So, I dunno. So, that’s where we’re at. Sorry to end on that note from me, but I’d love to hear hopefully more optimistic things from others.

Kit Walsh:

Oh, yeah. I was gonna try and jump in and, like, say that we’ve spent this whole panel talking about the harms and the importance of regulating the harms and, like, controlling access. But I do, like, I want to be a technology enthusiast. I just want certain conditions to be met first. Like, I’ve played with some of the AI art systems and some of the AI writing systems, and I really like the idea of, like, the possibility to open up creation to more people, to give more people the time and facility and abilities to do that. I will say I’m probably the person on this panel with the least programming experience. You know, I have some, like, basic ability, but an AI assistant would greatly increase my ability to do that.

Kit Walsh:

Like, that’s an, you know, amazing thing. And that would open that up to so many more people. So I would love to see us basically create rules that would keep that openness, like, keep that generative ability, which we have seen with other technologies like the printing press and software development in general. And yeah, I think that if we can find a way to balance regulation of the harms with keeping that system open, I think that AI in general has a great possibility to enable more people to create and to participate in creation.

Carlos Muñoz Ferrandis:

Yeah, for me, I think what I would like to see in the future is what we just enjoy today. And also thanks to Stefano for this. I think we have to appreciate that we are here, some representatives of different associations or initiatives, three different open licensing or licensing communities, right? Creative Commons, OSI, RAIL Initiative. Sometimes we might have some frictions, we might have different interests, but we are here today for a common end. We are collaborating together. This is just the start. And this is fantastic. I mean, I’m a 28 year old guy. One year ago I was doing my PhD in law and suddenly I got into developing a license for a large language model, right? And now I had a chance to discuss with lawyers that have, what, 20, 30 years of experience. This is the type of collaboration we should welcome and we need, right? This is the way to go forward. I don’t have right now the response or the answer, but this is just the beginning.

Kat Walsh:

Oh, thanks Carlos. We should let you go last, ’cause that was very hopeful and positive, and I’ll try to end on a positive note as well. But I do, you know, I was thinking about the way that the tool potentially opens up some very, you know, powerful forms of expression or social change to a wider range of people. Now I’m thinking about the printing press, right? And what was the reaction to the printing press? The reaction to the printing press in a lot of places was to lock down who has access to it in order to maintain control, right? That’s how we got the Statute of Anne. That’s how we got copyright in the first place, right? It’s the reaction to try to control who’s able to disseminate ideas. And Kat has brought this up a couple of times, you know, being really vigilant against regulatory solutions that mean that only the already most powerful actors are able to take advantage of the promise of this kind of technology.

Kat Walsh:

So I think that’s a lesson. I don’t know if that’s a lesson from the open source movement. It’s a lesson from, you know, 15th century Venice that might apply to the world of AI. I think I’ll leave it at that because, you know, and I do think, you know, there are things that we learned from... I’m not gonna leave it at that. I’m gonna go on. There are things we can learn from, you know, open source and free software in terms of, you know, the rights of people who use and are affected by the technology that gets made. And I think you can translate a lot of those principles; you know, they are definitely still valuable and are a good way to start thinking, and also to spot the ways that this presents new challenges to address. So I think the positive note is thank you for convening this series so that we can all, you know, puzzle through these things together.

Stefano Maffulli:

Thank you. Thank you, and thank you all. This has been fantastic for me. And my positive note: I keep on thinking, today I was already shocked again by the fact that many of these AI systems are really giving access to a lot of art. Like the integration today of Microsoft PowerPoint with DALL-E. So lots of clip art now can be generated without having to search, like, the idea, oh, I have to have a picture of a cake in here. So it’s gonna free up a lot of time, and I think it’s gonna give us good, hopefully good, outcomes. So thanks everyone again for joining. We will be continuing this series next week with a panel. The next panel is a legal panel, and we will have speakers from the OSI, IBM Research, the American Civil Liberties Union, Washington, and Hugging Face. So stay tuned. More to come. Thank you.

Carlos Muñoz Ferrandis:

Thanks.
