Jocelyn S Paulley
Partner
Co-leader of Retail & Leisure Sector (UK)
Co-leader of Data Protection and Cyber Security sector (UK)
On-demand webinar
60
Jocelyn Paulley: Good morning everyone and welcome to the fourth in our IT Masterclass webinar series developed specifically for in-house counsel. Good morning, hello everyone and thank you for joining us. I'm Jocelyn Paulley, I'm a Partner in our Commercial IT & Outsourcing team at Gowling WLG, and I'm going to be chairing our Q&A session today with Matt Hervey, a Partner and Head of our AI team and today's topic is on intellectual property strategies for data and artificial intelligence. The background for today's session is that machine learning based on large data sets is disrupting fundamental business models across multiple industries in all sectors and ultimately even society itself is going to be affected. Clearly there's going to be winners and losers in that process so Matt is going to talk to us today about how to protect your AI investments and how to unlock and realise the value in your data.
Just some housekeeping before we begin, please do put your questions in the Q&A box on your screen as you go along and as they occur to you, we'll pick them up. Any we don't pick up as we go along we will deal with at the end. I can also confirm that the session today is going to be recorded, it will be available online afterwards and we will also circulate a recording to all of those of you who are attending.
The format today will be slightly different to our traditional IT Masterclasses if you attended the others in the series. Matt and I are going to have more of a Q&A conversational style session so we're not heavy on slides today so don't be concerned if you think your slide hasn't changed that you have in front of you.
OK, without further ado here we go. Good morning Matt.
Matt Hervey: Morning Joss, how are you?
Jocelyn: Very well thank you. So you are Head of the AI team here at Gowling WLG. AI will strike many of our attendees as a new legal discipline, so how have you come to be interested in the area, what's your expertise and who else is in your team?
Matt: So I've always been a technophile and before I got involved with the AI project I always did tech patent cases so lots of telecoms work, video compression, that sort of thing. My interest was sparked back in about 2011 by a Wired magazine article on autonomous vehicles, which was research ongoing under the aegis of DARPA in the US since 2001. I hadn't previously heard of it and as a tech lawyer I immediately saw the issues for liability, for standard essential patents and all sorts of unresolved legal issues. Then brilliantly actually ThinkHouse got me thinking about this in terms of AI, and I gave a couple of talks over the last few years on that, and then I realised actually there are so many general issues, cross-practice issues, so affecting competition law in terms of access to data, IP rights about how you protect this, product liability, radical changes to employment if AI automates more tasks, and so I realised that it actually required someone to pull all this together. So I've written a book on that with Sweet & Maxwell, or rather I've edited it with about 20-odd authors from academia, the bar and private practice, to really capture that sort of holistic view that's required. As a firm we fully embrace that and we are gathering together a cross-practice team of experts, globally as well I should add, so that when any client comes in with any sort of project involving AI we can immediately give them a holistic view, because you may want to do one thing in terms of the business case but you, Joss, may say there's an issue with data protection, our regulatory specialist may say there's an issue with disclosure or proof of safety, I as an IP lawyer may have concerns about how to protect it, and then Bernardine Adkins, our competition specialist, may say that there is a competition issue. So I think, whereas some firms are looking at this mainly as a corporate transaction issue, we are very much making sure we can give holistic advice here.
Jocelyn: As an IT lawyer, I am used to hearing lots of buzzwords and, say, fads in the technology space. As new technologies come to the fore, everyone gets very excited, and then we go through a year or so of all being very excited, and they don't deliver or they're not adopted by as many or as quickly as we had hoped. Do you think that'll also be the case with AI, or is it going to be different somehow?
Matt: I fully believe it's going to be different, and I have spent a decent chunk of time on other fads. 3D printing was expected to have a big impact on IP rights because if you could manufacture any product at home, it would change the whole market, counterfeiting would be a real issue, and I'm glad to say that I called it "hype" at the time. I personally am a bit of a blockchain sceptic as well. I think, in the vast majority of cases, a mere database would be perfectly adequate, and it's really only for things like cryptocurrency or immutable records held by governments that there's a real application. A lot of it's just hype and will go by the by.
But AI I think is really, really different, and that is because since about 2012, there's been a real step change in what can be achieved with AI, and it really affects almost every industry and I think that's because AI has now unlocked just a couple of really intractable human skills, particularly vision - so the ability to see and understand the world - and also to understand language - and if you think of it on those two bases, once you've unlocked those two, you've unlocked so many human activities. So self-driving cars is a remarkable technology, but really, the difficult bit was vision. Everything else is just physics and mechanics. It's not that interesting in terms of the technical challenge, but once you have vision, you can literally change your cities as a result, and the way we live. So the implications are vast, but also the reality is actually here, so self-driving cars are really imminent in terms of technology, at least at what we call level four, where, in certain circumstances, they could be wholly autonomous. It's really actually the legal issues, the legal impediments which are holding them back, so the likes of Tesla, in the States they say they're ready to deploy, but the estimates, at least for last year, were that there might be six years' work just to get the legislation changed to make it possible. Hence, our role in all of this to make this possible, and I want to make a few other points about why it really matters.
First of all, there's a McKinsey study on the adoption of AI by businesses, and back in 2018 they'd already found that a fifth of all European businesses were actually investing in AI and in some industries, it's closer to 100%. So for example, in the insurance industry, everyone seems to be piling in, and it just gives you a sense of its extent when the ball gets rolling. The importance has been recognised by governments, so AI is now the focus of the industrial strategies of most developed nations including the UK, and various governments and inter-governmental bodies are working on specific regulation, and that never happened for 3D printing and it never happened for blockchain - they were fitted into the normal regulations. We have specific regulatory frameworks being drawn up for automotive, for aviation, for medicine, for example, and at the EU level, and at our national level, there are general regulations being written for AI specifically. We have an Office for AI and we have the Centre for Data Ethics and Innovation in the UK, so it's actually created new Government bodies, so I think that gives you a sense of how important this is.
It's also been recognised by investors, and the UK enjoys the third highest level of investment in this field of any country in the world, and we have an amazing track record, so DeepMind was sold to Google for half a billion, BenevolentAI is valued at well over £1 billion and they are both AI poster-children for the UK. We have a Centre of Excellence at the Alan Turing Institute which pulls together our five leading universities and we have clusters at Cambridge and in London of serious expertise. We advise many start-ups out of the Cambridge-London corridor, particularly bringing together AI and life sciences. I think the other issue is its potential to disrupt, so in life sciences, again, big pharma are now looking at a field of over 1,000 new start-ups, really trying to eat their lunch, and they're beginning to invest in the technology as well, and automotive is a full-on existential race to see who wins the market and wins the future of autonomous driving. So really, I would suggest strongly that any in-house lawyer should at least know the general shape of the area in terms of the technology and in terms of its legal implications.
Jocelyn: It certainly seems very different to previous fads where you have world start-ups but not necessarily that engagement from existing larger companies and particularly what you might call more traditional industries - you mentioned insurance, you didn't see them rushing to adopt some other technologies. It does feel like it's going to be different.
In conversations I've had with clients, and in the work that you and I increasingly seem to do together in this space, the other fundamental element of AI and its value is the data that's either pushed in to help train it, or indeed the data that results once the artificial intelligence has crunched all its numbers and worked its way through the process. The software and technology is nothing without the data that feeds into it and results from it, so it seems to me very much that the two go hand-in-hand. So should companies also be thinking about the data as well as an AI strategy or AI investment?
Matt: Yes, and I think that's a really important point because when you think about it, even if a company doesn't want to develop AI itself, even if it's not a tech company, it might well be sitting on data which would be of use to those people who wish to do that. I'll explain why the data matters. AI is really old. People have been researching it since at least the 1950s, but the step change I mentioned, in around 2012, was really the leap forward in a particular form of AI called machine learning, and in essence, it's a computer programming a computer. It's really that step change which has allowed us to solve those intractable problems like vision, because we don't have to know how the system works - we just need to have the data so that it can figure it out for itself, and really it's in supervised learning where data comes into its own, and in particular, large, even ultra-large labelled data sets. So ideally you'd find something pre-existing, and so you might have, for example, databases in medicine of x-rays which have been labelled with the diagnosis, or even better with what was finally established to be the case for an x-ray, and you can use that body of x-rays with the labelled information, so that a machine can learn to label a new x-ray and therefore to diagnose what's going on with a patient. Or you have to generate the labels - but again, you need the raw data of some sort, so for example, if you have a network of CCTV cameras, there will be companies who will want to purchase your video feed and then get your video feed hand-labelled to say that's a pedestrian, that's a car, that's a cyclist, to generate training data to go into machine learning. So there's a constant hunt for suitable data sets, and the really interesting thing is the value is so unclear at the moment, and changing, and I'll just give you a flavour of that.
First of all, your data may be of totally different value to different potential collaborators depending on what their aims are. To give you an example, CCTV again of a street - an individual data source may contain so many different data points for different people, so CCTV might be useful for geography, for road layout, for driver behaviour, for traffic flows, for weather, and it might be very valuable in terms of weather and not very valuable in terms of road layout, but you never know, you just need to experiment with this and value it as best you can. Also your data may be radically different in value to a collaborator depending on what data they have. Your data may unlock the rest of their data in a key way, so that it becomes particularly valuable to them. The other thing about data is it's inexhaustible in the sense that I can license it to multiple parties and the data doesn't disappear - it has different values to different people, so you have to constantly consider that, and the market itself is evolving.
The EU is deliberately introducing measures to create public data pools and to create the EU as a data marketplace. They're also looking at enforcing data sharing, so as to break down the big incumbents, particularly the big US tech companies who sit on so much data that they have such an advantage, so that may radically change the value of data. Then, of course, there are all these concerns via regulation on privacy, data bias - and that will affect the value of your data. Is it actually GDPR compliant? If it's not, does it have any value? Can it be used safely? Is your data legal, and are you liable for that data? So if your data isn't clean, if it's actually biased, what sort of warranties are you going to have to give? What effect will that have on the value?
So, really, I think at the moment, the simple best-practice rule is, given the uncertainty of value, do not give away your data and do not exclusively license your data. Keep as much control of it as you can, because it may be more valuable tomorrow.
Jocelyn: I think that's a new mind-set for a lot of companies, particularly in traditional industries, to think of their data itself as an asset where they have to think about protecting it, controlling it and valuing it, as you said, which is quite a difficult jump if, traditionally, you make cars or you're in insurance, or you're producing drugs - it's a very different world to be thinking about. So what, as lawyers, are the tools we could give people to help them think about protection? You are an IP lawyer, so I am imagining there's a strong IP bent, with certain IP rights you could use to help protect both the data and the AI developments themselves?
Matt: Yes, the interesting thing is that IP is my main concern and the thing I'd always turn to first, and it is undoubtedly a key consideration in AI. Some purists would say that trade secrets aren't an IP right, but really that's where the game is: trade secrets and contract. But let me just talk about traditional IP rights proper, just to give you a sense of the challenge. So I think the key assets when it comes to AI are, first, your development tools, so you can have a platform for developing AI, such as TensorFlow from Google, a platform anyone can use, and that will obviously attract copyright as a program and branding and the like. You've got your learning techniques, and you've got data processing methods, because you can't just put data into an AI, you need to process it in various ways: cleaning up the data, reducing the number of pixels, that sort of thing.
You have your trained AI models, so once you've actually done the training process, you may have a freestanding piece of software essentially, or something you baked into a camera which would have your visual analysis, so a camera that can identify pedestrians and the like. Then you have products of AI, so this is their predictions or maybe you have a journalist AI that's writing copy for newspapers or you've got a creative AI that's creating paintings or just identifying targets of pharmaceutical research.
So you've got all of these potential assets and you're trying to fit them into traditional IP rights which simply were never designed with AI in mind and deliberately excluded data in the form of information. Just to give you a feel for that, the European Patent Convention was finalised in 1973 and that still sets the rules for what inventions can be protected, and at that point, computers were using punch-cards - they just weren't thinking about inventions by AI and the like. Also, information is expressly excluded from copyright protection by a very old international treaty, so it was never the intention that these sorts of assets, well, they were never properly considered, let's put it that way.
So, if I look at patents first. There's been a massive increase in patents relating to AI over the last decade, I mean at least an 800% rise, but they're applications, not necessarily granted patents, and it's relatively hard to get patents in the AI space. That's because, as I mentioned earlier, research has been going on since the 1940s and 50s, and a lot of the fundamental ideas are actually very old and can't be patented as they are already known. Secondly, patent protection excludes protection for mathematical methods, methods of doing business, and computer programs as such, and that carves out many potential AI inventions. Now, I can protect, for example, methods of preparing data, and methods of training on data, you know, clever little twiddles on how AI is done, but certainly not the data itself, that's just not within the scope of patents.
Then on copyright, data in the field of AI is quite a broad term, so it can just mean mere information, or it can mean the stuff you put into your AI. When a computer scientist talks about data, they literally mean photos or video streams or handwritten notes, so those might well attract copyright individually - photographs, medical descriptions, articles, that sort of thing - but not mere information, so not the extracted data, not weights and measures of your goods, that sort of thing. Copyright is also suitable for computer programs, so if you've written your AI platform, if a human has written it, the form of expression, bizarrely, it's a strange thing to think about for computer programs, but that will be protected, so someone can't just copy your computer program. But sadly, they are entitled to copy its functions, so copyright is no way to protect the functions of a computer program. When it comes to works generated by AI, so its predictions or an artwork or an article for a newspaper, it's not really clear where we've ended up on that, because the UK had the intention of protecting such works and has a specific provision - section 9(3) for computer-generated works - but since then, the EU has harmonised the test for originality to be the author's own intellectual creation, and even our own UK IPO now says they don't know if section 9(3) still works, because of the later law in Europe, and they're consulting on that point at the moment.
Then finally, in terms of traditional IP rights, we have database rights, and they expressly exclude the data itself. It's all about the right kind of investment when it comes to sui generis database rights. That's an EU-specific right and it won't cover databases created by UK entities from the beginning of the year. We're hoping for some sort of reciprocal right, we'll see, but frankly, it was never much used, because there was never any reciprocity for similar rights outside of Europe anyway, and very rapidly, the CJEU really cut down its effect, because it's all about investment - you have to have the right kind of investment for a sui generis database right and that's investment in obtaining or verifying or presenting the contents. The case law appeared to say that if you were generating the data for your day-to-day business, so if you were a pharmaceutical company and you had a load of clinical trial data to prove the efficacy of your drug, you didn't invest in obtaining, verifying or presenting the database, you were investing to get your clinical trial done - and so there's this sort of rule of thumb that it doesn't apply to spin-off data, which is a huge carve-out, and has really emptied the database right of most of its potential value. I think recent commentary has suggested that's an un-nuanced approach and there is no absolute rule against spin-off data. The other thing, with the rise of AI and in particular machine learning, is that there's so much pre-processing of data that actually there's potentially the right kind of investment there, so if you can show that you invested to verify and present the data correctly for ingestion, as it's called in machine learning, you may have made that investment, and you may get a database right.
Jocelyn: As you said, the existing traditional IP rights do not lend themselves easily to the new structures, the new process, so we are doing as lawyers often have to do in the technology space and working with the rules that we have to try and see and understand how they might apply or be useful to us in our new technology. So you have talked about a few traditional sorts of IP rights there, and said there is some level of assistance but there are quite a few difficulties and caveats as well, so what is your advice on the best way or the way that is most likely to succeed to use the existing powers as a protection?
Matt: So I think there is still scope for traditional IP rights and they should be pursued where they are likely to work, but some of this is so new, in terms of the importance of data, that there is just a lot of uncertainty about how it's going to come out. So I think the absolute best practice, the first ingredient, is trade secrets, because they are clearly broad enough to protect information itself, so mere data, and broad enough to protect your algorithms even though they are functions that cannot be protected by copyright. So that is definitely the way to go, and this is really illustrated in the States by the litigation between Waymo and Uber. They had a dispute as to autonomous vehicle technology and it was not a patent dispute, it came down to trade secrets, and I think that is very telling for where the real rights are probably going to be.
So in the UK we have always had, in recent history anyway, a common law right to confidential information. Then we have got a harmonised EU Trade Secrets regime which has been moved into our national laws. Really, in order to protect your trade secrets I would recommend a mixture of practical and legal steps. So first, practical measures to keep information secret. It is things like restricting physical access to your secrets. Restricting electronic access, so make sure employees have levels of permissions and passwords and that there are access logs for core secrets. Then you can use electronic measures to monitor suspect activity, so if data has been transferred by email or to cloud storage or to memory sticks you can get an alert. Then it is just things like staff training and labelling stuff as secret so people know if what they are dealing with is a secret. Then the second fork of that is legal measures. So make sure NDAs are in place with collaborators and guests; make sure that there are proper terms of employment that deal with your secrets; staff handbooks and policies, training, that sort of thing. Ideally have a plan for if there is some sort of breach, because the problem with a trade secret is, if it gets out, it is too late, but you may be able to use legal and practical measures fast enough to prevent dissemination even if one person is trying to take your trade secrets.
But the point to really emphasise here is, under the harmonised regime, the very definition of a trade secret requires you to have taken measures. So it literally says that a trade secret has to have been subject to reasonable steps under the circumstances by the person normally in control of the information to keep it secret. So my other practical piece of advice is, any measures you choose to protect your trade secret should be measures you can easily use as evidence in court. So things which generate their own logs, things which are documented, so you won't have a problem there.
I would just warn of one other thing. There is a particular risk when it comes to AI, and that is because the demand for people with skills in AI so far outstrips the supply that there is a shortage; people are in high demand, they make good money and they are a very mobile workforce. So protecting trade secrets is challenging. They are techy, they are millennial so they don't necessarily see jobs for life, they are part of the start-up culture, and so I think you have to be extra specially careful with trade secrets when it comes to data science and AI technology, even more than in traditional areas of research.
And then the second key issue I think here, and this is why Joss and I have ended up working together more than ever, is contracts. So the issue I have with IP is, one it may not apply, and two you don't know who is going to own it for sure. There are some default rules in copyright and patents but you can clear all of that up with a good contract. So take it away, tell me what I should do with contracts.
Jocelyn: Yes of course, as you say contracts are such an important tool in this area because I think, and Matt has made this point but just in case anyone did miss it, there does seem to be a myth that you can own data. Often if I have conversations with clients they say, well it's my data, of course a supplier can't just go off and re-use it. But, for the reasons you have explained, the fundamental proprietary legal concept of ownership does not in fact apply to data; if it did, the world as we know it just wouldn't go around. So what we have to do in contracts is exert rights contractually to control access to, and the many uses of, that data. Contracts are always important in English law because we don't have a constitution: you set down what you want to be agreed in that contract, and in this area that is more important than ever around use of data. You can use all the language and tools that we do for traditional software licensing and indeed for IP rights, because if you look at a SaaS contract it often talks about granting a licence despite the fact that technically there is no need to move that IP, as you are just accessing a service which is using software in which there are intellectual property rights. So it is still totally valid to use all those same concepts that you would use for licensing IPR or software around data. You can treat that asset in a contract as if there is IPR in it even though technically that does not exist outside of the contract. You can use all that terminology, so you can talk about a right to use data being revocable or irrevocable, or just for the term of the contract, which is obviously an important point to think about. Is it further sub-licensable by the party to which you are granting this licence? We are involved in schemes where we have data being licensed to re-sellers who go off and license it further, and clearly there are big players in the data market.
You look to the likes of Experian and companies like that whose whole market is bringing data sets together. You can talk about data being licensed within a territory or outside of it, and the key is always the permitted purposes: what are you setting out that this other party can or cannot do with that data? Usually that is couched in terms of business-type purposes or use in particular products, but you can frame it how you need to for the context of your particular contract.
As well as all the traditional aspects we are used to thinking about, there are some new and different things to think about in an AI context. So if you are putting data into a system, as Matt said, it is going to be used to train the system, or it is going to be ingested and then run through a set of criteria and learnings, so it is quite usually going to be combined with other data. Are you happy for that to happen to your data? If you are not happy, it might be that this is not the right tool to use, but as with SaaS contracts and anything in IT, it is about understanding the processes that are going to happen in order to understand whether the risk is appropriate or not. If the data can be combined, the question is whether it is combined in a way that it can then be separated at the end of the day, or whether it is then inherently within something else, some kind of derived data or output, or embedded in something else. And who then has what rights around that something else that is inextricably linked with someone else's data sets, or even multiple third parties' data sets if you are working with an aggregator and they are combining sources from multiple partners? This multi-faceted point is another different element to play around with in contracts, because typically what Matt and I are seeing is that these are not just two-way contracts; we don't yet have an established ecosystem where you have one supplying to another and building on a technology stack. These are multiple players, from start-ups to big companies to traditional players, coming together to collaborate.
Matt: In different roles as well, so you have the data scientists, the data supplier, the customer, the data platform and all these people have to work together.
Jocelyn: And all bring value. So it's not clear, if you are just following traditional customer-supplier thinking, who is going to have what rights in these combined outputs at the end, so it is critical that you think it through and are as clear as you can be in your contracts with your defined terms, which can be quite difficult. I have seen contracts talking about combined data, manipulated data and derived data, trying to separate out those threads to be clear as to who is doing what.
Particularly on exit, make sure it is clear: if you want to get back what you put in, state that that is the case. Think about people needing to keep copies for backup, for insurance or for other regulatory reasons. I know, Matt, we are going to talk later about how critical regulatory aspects can be when you are thinking about AI. Also you raised the point earlier about warranties. If I am providing data into a system, is there an expectation that it comes with any kind of warranty that this data is accurate, up to date and has retained its integrity, and what about the effect of duplicate or contradictory values? Or if you are going to be a user of that data, what is the impact on your use case if in fact the output you have got is from a machine learning exercise where it has learnt incorrect things, or it has extrapolated correlations that we don't agree with in the real world? We have seen plenty of cases recently of discriminatory results in criminal law enforcement or in recruitment exercises where the data that the AI has used to understand its world has inherent discriminatory aspects because of the way the world has developed.
Matt: So that can mean your tools don't work very well but it is also a massive reputational risk. Well it has the potential.
Jocelyn: And lastly obviously regulatory breaches either from a GDPR point of view or within an employment context. That sort of leads you on to thinking about liabilities, as Matt touched on earlier, around accuracy of data, but you also talked about looking at what data you have within your business and working out how you can re-use it. I think before you work out the how, you have got to work out: can I re-use it? If I am repurposing this data, were there any restrictions that apply because I got it originally from a third party? So although it is within my estate, again it doesn't mean I have rights to do whatever I want with it; I have to go back and see what contractual controls were in place. From a GDPR point of view there is clearly an issue with re-purposing of data, because you have to fully inform individuals how you are going to use their data, and tell them about it when you collect it rather than later down the line once you have decided you might want to use it for something different.
Then finally in a liability sense, thinking about reliance on outputs. If I am saying "here I am, I have a tool, you can use it for these, it's very helpful for supporting decision making, making diagnoses, seeing the world around us and making real time decisions", what is the impact on you as the provider of that tool if in fact it provides a decision at a point in time or a diagnosis that is not the right thing to have happened in that scenario? If you take a very traditional software supplier, you put that kind of hat on and you are thinking, well they would always say "well, we are not here to insure your business, if there is a bug in our software then we'll do our best to put it right but you can't hold us responsible for that and try to pass on to me, through a damages claim, the full impact and ramifications that may have on your business because, as an external provider, I can't anticipate what they will be". And with that hat on I can very much see the use of AI tools going down that route. So be aware when you are using them what they say around being decision-support tools or information for you to look at, which you can't rely on solely as a basis for a decision.
As Matt flagged before, confidentiality is obviously key. I would look specifically at exactly how confidential information is defined, because it is unlikely to pick up things like derived data or combined results of different confidential inputs, so you do I think want to play around with those terms to be very clear what is covered and make sure that your practical real world treatment supports the fact this is actually confidential data. Think about how it is being stored, is it labelled, or if your underlying data is going to be exposed through use of a tool, do you want any accreditation to be attached to it, so again it can be traced back.
And a few clauses we are starting to see creep into contracts more commonly, even if it is not a contract to do with AI, if it is one where there is a significant data element, suppliers are starting to be more open and up front about the fact they might want to re-use this data for their own purpose. Historically it might have said just for their own product development, quality management type purposes, but increasingly I am seeing wording alluding to something much broader, much more commercial and revenue generating so do watch out for those. And also be alive to the fact if the contract is silent, it is still something you want to think about because I think there is an interest as Matt has said quite rightly in understanding the value of data that you hold and seeing what you can do to unlock that.
And the know-how point which you raise there about a very mobile workforce. I think this used to be an issue tens of years ago when software development was more in its infancy and the techniques that people were using to develop and code software were highly valued. I think we are in that space again now with AI know-how, where it is actually incredibly valuable and hard to protect either as confidential information or as an IP right; it does fall in one of these gaps. But contractually you need to work out whether the know-how that has come out of a collaboration is worthy of protection in some way and think about how you want to treat it.
Matt: Yes, I think a little bit in terms of the classic confidential information issues, can the employee hold it in their head. And so, if there is a technique they have discovered as your employee which they could take in their heads and you could not realistically protect, in that sense under contract or under confidentiality, that's where a patent does work. If it is genuinely new, if you can get a monopoly right, even if that person can leave and take it in their head, they are not allowed to use it because you own the monopoly. So, you definitely need to keep cross-checking against traditional IP rights and to see how you can achieve your aim.
I think the other point in which we overlap a lot is that, not only are we working individually on specific tools to achieve your aims in terms of IP strategy, but also, and you have touched on this a little already mentioning privacy and GDPR, you really need a holistic view of all of the context, whether it is your business model or whether it is legal in terms of, let's say, GDPR, whether it will be affected by regulation and also whether it is ethical. All of those I think are actually feeding in to certainly what you do as a business generally but even to your IP strategy. And just to focus on one element of that, and that is disclosure. There are so many ways in which you might be required to disclose elements of your AI or your data and that means that a trade secret then suddenly isn't the solution, the contract may or may not be adequate protection and it may be that you do need a monopoly right such as a patent where possible. So you are an expert on explainability under GDPR so, what sort of disclosure requirements may that entail?
Jocelyn: So this is all about when you are dealing with personal data specifically; being able to explain to individuals what data you are processing, on what legal basis, so why you are processing it, what you are going to do with it, and thinking about AI, there's an element of automated decision making, letting people understand that, understand what effect it could have on them and ultimately giving them a right to come back and challenge it. The regulators are pushing us, well people operating these tools and in control of the data, to give greater and greater transparency, more granularity, and these are not easy concepts to explain in layman's terms. So even the challenge of breaking it down in terms of just the language that you are using is not an insignificant one. But as you say Matt, if I have to set all that out to individuals, that might involve disclosing a lot of what's going on within the tool and it sort of affects the whole strategy. It also means that if you are using these tools it is so important to ensure you are not being given a black box because if you can't explain that process, you are going to fail a regulatory hurdle, and particularly as individuals are allowed to challenge decisions that are being made about them, you as a controller are going to be left in a very difficult position if someone wants to challenge that decision and you go back to try to unpick how it has been made. If you have a black box, you are not going to be able to meet that hurdle.
Matt: Or you can keep a human in the loop or do something to mitigate the risk of using a black box. Because I think a lot of machine learning techniques do generate a black box and I think this is going to be an issue, it is currently unresolved frankly, but in transport for example, and I think in medical devices, where you use a black box and it's going to be in some way involved in life and death scenarios, proof of safety for a regulator is going to be critical. And this is a work in progress. So for example, EASA, the European Union Aviation Safety Agency, they are working on regulation because they want to have autonomous drones in the skies, or to be allowed, by about I think 2035. But their draft documents say, look, we're going to have to figure out how you prove safety and while we are at it, we don't have the skill set or the people in-house to know how to figure it out yet. So it's a real issue. But again the prediction is some sort of disclosure is going to be needed and various bodies have been coming together on how to do this regulation and they have already been raising concerns about how you ensure that the disclosure doesn't in fact destroy your IP. So this is a live issue and one that I think is evolving and people need to consider because it's going to change your IP strategy as you go. I think it's not just aviation or medical devices, it's really these general regulations as well, the need for robustness, to avoid bias, to achieve transparency. Irrespective of GDPR, in terms of ethical reasons, general regulation, reputational protection, we just don't yet know what sort of disclosure is going to be required and I think the final ingredient is reverse engineering.
So there is increasing awareness that you may be able to reverse engineer data out of an AI product, so if I have a camera that can label pedestrians, I can keep it running and generate a new dataset of labelled data and then use it to train my own AI, thereby essentially reverse engineering the product, and I think there's again some uncertainty as to whether you can effectively exclude that under contract because our Software Directive in Europe does prevent certain exclusions to understanding the workings of a product where you have lawful access.
So very much a very moving picture, but all of that means that a monopoly right such as a patent may be the only answer, and so this is why you need this holistic view and working through in competition law as well, no point having a trade secret if you're going to be forced to divulge your data because of some sort of antitrust action so again very much, you think this is just an IP question but actually it's much broader when it comes to AI.
Jocelyn: I think we have to throw ethics in as well, don't we, because as we said at the beginning the law at the moment isn't up to speed with where real world developments are, things don't quite fit, but this is actively happening on the ground already, companies are developing, using, data scientists are investigating what is possible, so ethics is having to step in to that gap. This is somewhat difficult and frustrating for a lawyer because it's not a world that we deal with because it is more subjective and more conceptual but this is having an early impact on the sector rather than the laws which are on their way but not here yet.
Matt: Yes and I've certainly come across companies where they now have internal ethics teams who have a very regimented, documented process where, like you have in privacy, you have your considerations and you document it, same for ethics, and no they're not lawyers, they're typically engineers and people like that. I think it pays off to do that because if you're considering these things you are going to hopefully reduce the potential for harm and for reputational harm and hopefully you will predict where the law is going to go, and some of the bigger companies I've worked with who had these sort of ethics teams, they are literally working with their regulators and they're actually sort of steering the direction of regulation itself by that involvement and minimising their risks.
Jocelyn: Well the very fact that we have a Centre for Data Ethics and Innovation in the UK as one of these founding bodies that is helping to shape the way that AI usage will be regulated and judged is, I think, as you say, indicative of how the government is dealing with it at the moment. I'm conscious of time, that's been fascinating, hearing more about this industry. Do you have any final thoughts you want to give to sum up some of the ground we've covered?
Matt: Yes absolutely and we've prepared a slide as well just so that people can see it. I've prepared four key takeaways, the first is AI may affect all businesses and that's what we talked about at the beginning, the scope of the impact. So machine learning has cracked these key computing problems, particularly vision and language, and that unlocks applications across all industries, and I think whether or not one company wants to engage, their competitors may, so everyone is going to be forced to engage ultimately. And as we've discussed you may also have data, even if you don't want to develop the technology yourself, and that is something you should protect because it is so important to so many machine learning techniques. Because it's an evolving market and data is inherently hard to value at the moment, the best practice is to just maintain ultimate control wherever possible.
Then third, IP rights weren't really designed for AI so trade secrets and contractual measures are definitely the go-to at the moment. They are also cheap, there's no registration, no renewal fees, and I think contract is incredibly important in this field because on ownership there are some default rules in IP but they are so hard to apply to AI and so you really want to make sure you have clarity on that.
And then finally the thing we just finished talking about is this need for a holistic view of all the legal parameters, the ethical parameters, the regulatory parameters, not just for your business case as a whole, but even for your IP strategy, and that's really because the need to disclose things or to collaborate with people may mean you need a traditional IP right, a monopoly right, in order to avoid losing something if you've only gone for trade secrets.
Jocelyn: Brilliant, thank you very much Matt, that's been absolutely fascinating. We've got some time for questions. There is one so far which we can have a look at, which is, I think this might be a bit of both you and I answering here.
Matt: That's what's happening these days.
Jocelyn: Yes. "How can you reconcile a desire to retain datasets that might become valuable at some point in the future, with data privacy obligations around retention, for example the CCTV image that Matt you mentioned earlier?"
Matt: I'm glad to say this is a question for you isn't it, it's whether you can keep hold of data? I'm going to sit back and enjoy, go on.
Jocelyn: So I agree, this is clearly a challenge because the purpose of having something like CCTV images is normally relatively short term, you need it for protecting employees, protecting property, in the event that it comes to light that something happened that harmed either of those you can have recourse to the CCTV to look and see what went on, so normally it's quite hard to justify keeping it beyond, I've seen 90 days, but much beyond that not very much.
You do have a technical argument with CCTV that it's not personal data until you actually review it to work out who is that person walking across the car park and taking a sledge hammer to somebody else's car, because up to that point it is general surveillance, you're not interested in who was coming and going until you come to look at a specific point in time or have a specific question. So a bit of a technicality there that might get you around it being personal data and therefore able to retain it for some longer period, but you do need to have an eye for principles like data minimisation, or is there anything I can do to mean that should this be released in an uncontrolled way it wouldn't actually cause any harm or difficulties to individuals whose images are there.
So you're going to have to think about what setting is this in, as anything in a public setting is much less likely to cause harm to individuals if it's released because it was of course in public to begin with, versus somewhere very deep in a technical laboratory of an R&D organisation where much more fundamental things are being protected and who is coming and going could be much more significant for those individuals. I think you would need to look at context there as always, as with anything privacy related it is absolutely critical.
You could also think about how you frame your privacy notice around some of these points to see if there are more concrete things you can put in in your privacy notice to alert individuals that data you are retaining might be used in different ways in the future, that is challenging because regulators do value specificity and push back on very general wording that we did see commonly before GDPR came in. So it is a balance and it's one of those risk areas and any decision you make here absolutely as Matt said, take the privacy approach, document it so you can understand later why decisions were taken and maybe practical measures that were put in place as mitigations and complementary measures.
Matt: I think another factor there is the extent to which you can extract and anonymise data that is valuable, and then discard the original data, but that has all sorts of issues with data robustness and bias and whether you need to keep the original data for that sort of ethical checking later. But I do know that for example local councils are selling extracted elements of CCTV footage to support the development of self-driving cars, so they need evidence from CCTV cameras for driver behaviour. But I think, it's not openly discussed so I don't know the details, but I get the impression that they have enabled the local councils to extract what they need in terms of the heuristics, the metrics, or what the machine actually needs, so they never see actual footage, they never actually see licence plates, and stuff like that.
There is another question from Omar whose son is very sensibly reading computer science and I totally agree, physics is another good thing to do because a lot of what you do in physics supports a lot of data analysis, so those are very valuable degrees at the moment. I've encouraged my own children to do that but they're having none of it and you've asked, I think for yourself, not for your son, if it's too late and what course should you do.
The truth is that as an IP team for example we love it when we get a physics graduate through the door because generally speaking they go to banking or somewhere they have a different career path, or a start up, and I actually think there is very little prospect of getting people with the ultra-valuable skills into legal careers and so it's really up to lawyers to train themselves. There are frankly phenomenal resources on YouTube to explain the fundamentals of how these systems work and I think depending on the work you're doing, you don't have to understand every last scientific nuance of how AI works, you just need to with your legal hat on, see how for example an AI related product may change once it's deployed. The fact that different people have created different elements of the system and so there will be complications with liability.
I think in terms of science learning, there are loads of online sources, and then in terms of legal developments, really the European Union, the Commission is doing so much work on liability and IP, the UK IPO is looking at this, there's screeds and screeds of online material at the moment on where this law is going.
Jocelyn: There is one final question and I'm aware we're approaching time and I don't think it is a two minute answer Matt. "How can algorithms be protected in the UK?"
Matt: Trade secrets, or copyright in the way in which they are expressed, but that's I think a fairly thin form of protection because you can copy the functions. And then, if it's a new and inventive algorithm with a technical effect, so it's not a computer program as such, potentially a patent.
And then someone's asked about open source. I don't think it's specific to AI tools but open source, again this is more Jocelyn's thing, but with open source you have to be careful because some open source is under different licences, sometimes it can open up your software to being open source and the like and you have to be careful.
Jocelyn: Yes absolutely, some open source licences have the copyleft effect whereby, if you are thinking about protecting proprietary software, the open source 'infects' your proprietary software and it all becomes subject to that open source licence, so if you are going to distribute the software you would then have to distribute the source code of your own proprietary software. So it comes down to exactly how you are using it and incorporating it with other code, because whether it's dynamically linked or it's fundamentally embedded will affect whether it has that copyleft effect, and then ultimately whether you're going to be distributing it as well would make some open source more of an issue than others.
I'm conscious of time so I think I'm going to bring it to a close there. Thank you so much Matt for your time today and giving us your insights on the world of AI.
That was the final session in our IT Masterclass webinar series for this year. The previous three sessions are available online and this one will join them next week, so if you want to look back at any of those that you missed they will all be available for you.
Matt's book published by Sweet & Maxwell, or will be published rather, come the end of the year if anyone is interested in that, that will be the authority for the time being on the law of AI. We will be back with IT Masterclass some time next spring whether face to face in our traditional format or like this will remain to be seen, we're in the hands of something that's beyond all of our