Artificial intelligence and international arbitration: uses and challenges

Clara Ruiz Garrido.

2023 International Arbitration Outlook Uría Menéndez, No. 11

At times, it seems like everyone is talking about Artificial Intelligence ('AI'). While some rejoice in its potential to positively transform society and stimulate economic growth,[1] others claim that it 'might destroy our civilisation'.[2] Within the arbitration community, there are those who say that AI is 'perfect' for dispute resolution[3] and others who raise concerns about its shortcomings in terms of accuracy, privacy, confidentiality and security.[4]

Amid all the noise, the purpose of this article is to take a step back to understand how the technology works and to shed some light on its main potential capabilities and limitations from an international arbitration perspective. 

Artificial intelligence: what are generative language models?

Since its inception in the 1940s, AI has developed significantly and it now has a wide range of applications.[5] However, it is only in the last few years that AI's potential to drastically change the way we work and live has truly begun to be realised through innovations in the subfield known as 'generative AI'.

Generative AI is a type of technology that uses existing data to create new and original content, including text, images, video and audio.[6] While research in the area of generative AI has been ongoing for several years, the recent refinement of generative AI models and their public release has catalysed their adoption and scale.[7] In this article, we focus on generative language models, as they will potentially have the greatest impact on international arbitration. Among the most popular generative language models are ChatGPT, developed by OpenAI, and Bard, developed by Google.

Generative language models are AI models designed for natural language processing tasks such as text generation and language understanding.[8] They often utilise deep learning techniques, particularly recurrent neural networks or transformer-based architectures, to capture the dependencies and relationships between words in a sentence or a sequence of words.[9] While we will not explore the technical specifications of these models extensively, it is important to understand (albeit superficially) how they work.

First, generative language models are fed with huge amounts of pre-existing text data, such as books, articles and websites.[10] The models then analyse the data and are able to learn patterns and relationships between words and phrases within the dataset.[11] Once this learning process is completed, they are able to generate new text by predicting the most likely next word or sets of words based on the preceding ones and patterns learnt from the data.[12]

Generative language models do not 'understand' text or language as we human beings do. Instead, they generate text 'by making probabilistic guesses about which bits of text belong together in a sequence, based on a statistical model trained on billions of examples of text pulled from all over the internet'.[13]
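To make the idea of 'predicting the most likely next word' more concrete, the following is a minimal sketch in Python of a so-called bigram model, which only looks at the single preceding word. It is an illustration under simplifying assumptions, not a description of how ChatGPT or any other commercial model is built: the three-sentence corpus and the function names are hypothetical, and real generative language models rely on neural networks trained on vastly larger datasets.

```python
from collections import Counter, defaultdict

# Hypothetical toy training text; real models are trained on billions of words.
CORPUS = (
    "the tribunal issued an award . "
    "the tribunal issued an order . "
    "the tribunal issued an award ."
)


def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    follower_counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def next_word_probabilities(model, word):
    """Turn the raw follower counts for `word` into probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {candidate: n / total for candidate, n in counts.items()}


def generate(model, start_word, max_words=5):
    """Build a sentence by repeatedly appending the most likely next word."""
    output = [start_word]
    for _ in range(max_words - 1):
        followers = model.get(output[-1])
        if not followers:
            break  # the model has never seen anything after this word
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)


model = train_bigram_model(CORPUS)
print(next_word_probabilities(model, "an"))
print(generate(model, "the"))
```

In this toy corpus, 'award' follows 'an' twice and 'order' once, so the sketch completes 'the tribunal issued an' with 'award'. It does so purely because that sequence is more frequent in the training text, not because it has any notion of what a tribunal or an award is.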

As simple as this process may seem at first, generative language models represent a substantial advancement in the state of the art when it comes to their ability to understand text, synthesise new text, compose ideas and reason.[14] Compared to earlier models, generative language models such as ChatGPT show enhanced contextual understanding, which allows them to better comprehend and respond to complex and nuanced inputs, making them more effective in generating accurate and relevant text.[15] Among their most impressive achievements are passing a simulated bar exam with a score in the top 10% of test takers,[16] writing poems[17] and programming code that actually works.[18]

Generative language models and international arbitration: potential uses and limitations

Generative language models demonstrate remarkable capabilities in producing coherent and contextually relevant text. However, impressive achievements seldom come without challenges. Generative language models also exhibit significant limitations that may hinder their immediate applicability in the field of international arbitration. The key ones are (i) factual inaccuracies or 'hallucinations'; (ii) sensitivity to biased training data; (iii) limited attention span; (iv) confidentiality concerns; and (v) outdated information.

First, generative language models produce text by providing the sequences of words that are most likely to be encountered in a given context, but the most likely response is not always the most factually correct. This can result in a model providing 'plausible-sounding but incorrect or nonsensical answers'.[19] One explanation for this may be that the model's training dataset contains insufficient data on the topic in question. Unawareness of this limitation has already caused some unfortunate incidents, such as the well-known case of a lawyer in New York who may face sanctions for citing fabricated cases in a court filing he created using ChatGPT.[20]
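The gap between 'most likely' and 'factually correct' can be illustrated with a deliberately simple Python sketch. The training sentences and the function name below are hypothetical and chosen only for illustration; the point is that a purely frequency-based completion reproduces whatever statement is most common in its data, whether or not it is true.

```python
from collections import Counter

# Hypothetical training snippets: the incorrect statement is simply more frequent.
TRAINING_SENTENCES = [
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is canberra",
]


def most_likely_completion(prompt, sentences):
    """Return the word that most often follows `prompt` in the training data."""
    completions = Counter()
    for sentence in sentences:
        if sentence.startswith(prompt + " "):
            remainder = sentence[len(prompt):].split()
            if remainder:
                completions[remainder[0]] += 1
    if not completions:
        return None  # the prompt never appears in the training data
    return completions.most_common(1)[0][0]


answer = most_likely_completion("the capital of australia is", TRAINING_SENTENCES)
print(answer)  # prints "sydney", although the correct answer is Canberra
```

Nothing in this procedure checks the answer against reality. The same holds, in a far more sophisticated form, for a model that produces a convincing but non-existent case citation.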

Second, generative language models are trained on a vast amount of data that may contain inherent biases. Because they function as probabilistic models, they may generate biased outputs and replicate or perpetuate the biases within their dataset.[21] Arbitrators should factor this in when utilising AI in their decision-making process, as it could introduce or amplify existing biases in their decisions.

Third, generative language models are still unable to process large documents and respond to questions based on information in multiple locations in such large documents.[22] Consequently, they are not suited to generating very long texts requiring persistent context; summarising large, complex texts; or consistently remembering constraints in the conversation.[23] This could significantly limit their use in international arbitration, where cases typically involve vast amounts of documentary evidence and lengthy documents.
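A rough way to picture this limitation is as a fixed-size 'window' of text that the model can take into account at any one time. The Python sketch below assumes, purely for illustration, that the window is measured in words and that older text is simply dropped when the input is too long; actual models count tokens rather than words, have windows of thousands of tokens, and may handle over-long input differently. The function name and the sample text are hypothetical.

```python
def fit_to_context_window(document_words, question_words, window_size):
    """Keep the question plus only the most recent document words that still fit."""
    space_for_document = max(window_size - len(question_words), 0)
    if space_for_document == 0:
        kept = []
    else:
        kept = document_words[-space_for_document:]
    return kept + question_words


# Hypothetical short "document"; the key fact sits at the very beginning.
document = "seat is Madrid . then follow many pages of unrelated text".split()
question = "where is the seat ?".split()

visible = fit_to_context_window(document, question, window_size=10)
print(" ".join(visible))
print("Madrid" in visible)
```

With a window of ten words, only the last five words of the document remain visible alongside the question, so the word 'Madrid' never reaches the model. In a real arbitration file running to thousands of pages, the same effect means that information located far from the passage being processed may simply not be considered.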

Fourth, it is crucial to be aware of the confidentiality risks associated with utilising publicly accessible large language models. For instance, ChatGPT's privacy policy makes it clear that ChatGPT 'may collect Personal Information that is included in the input, file uploads, or feedback that you provide' when using its services.[24] It is therefore important that users, especially arbitration practitioners and others who have confidentiality obligations, do not share private information with the model.[25]

Lastly, generative language models do not always contain up-to-date information. In the case of ChatGPT, its training data does not include information dating from after 2021. Arbitration practitioners therefore need to consult other sources for more recent decisions and other information.

Overcoming limitations

While overcoming some of the limitations explained requires technological advancements that are still far from being realised, others may be addressable in the short term. In the legal field, for instance, confidentiality concerns are being tackled through the development of generative models that employ solutions such as data encryption to protect client information. Notable examples are Harvey AI and Robin AI, which are already being utilised by law firms like Allen & Overy and accounting firms such as PwC[26] for tasks like legal research, contract analysis, due diligence or litigation. These models are specifically trained on legal data, including case law and reference materials, which reduces the risk of 'hallucination'.[27]

In the context of international arbitration, apart from overcoming the data privacy concerns, the models will have to be fed with an appropriate dataset, which will inevitably be comprised of thousands of transcripts from actual arbitration proceedings, sets of rules (arbitration rules, arbitration laws, bilateral investment treaties), arbitral awards and other arbitral decisions (such as procedural orders), law review materials, etc.[28] An obstacle to the creation of this dataset is the confidential nature of many arbitration proceedings, which limits the amount of available information. Furthermore, the dataset will have to be updated continuously to accurately reflect the current situation and how both the law and case law have evolved, which is another significant challenge.

Once the confidentiality concern is sufficiently addressed and the risk of 'hallucination' reduced, and as long as we bear in mind the remaining limitations, generative language models have the potential to significantly improve arbitration proceedings in terms of time and cost efficiency. It has been suggested that generative language models could be employed to assist with (i) summarising and synthesising evidence; (ii) translating evidence and other documents; (iii) drafting legal documents (such as parties' submissions, procedural orders or even non-substantive sections of an award); (iv) predicting the outcomes of an award; and (v) even selecting arbitrators.[29]


Generative language models offer powerful and revolutionary capabilities that have the potential to greatly enhance the productivity and efficiency of legal professionals, particularly in the field of international arbitration. The models' ability to comprehend and analyse vast amounts of information is particularly valuable in this domain, which is characterised by extensive documentary evidence and lengthy legal documents.

However, it is important to recognise that generative language models still have notable limitations that hinder their immediate application in international arbitration proceedings. The primary concerns include their lack of factual accuracy and potential confidentiality issues. To fully leverage the benefits of these models, it is imperative to address and overcome these limitations.

While advances are being made to overcome these challenges, we must remain vigilant and consider the remaining weaknesses of generative language models. Human judgement and oversight are essential to ensure this technology is used responsibly and effectively. Continuous monitoring of future developments is vital to fully harness the potential of generative language models while also safeguarding their responsible use. By staying informed and adapting to new advances, arbitration practitioners can capitalise on the benefits of these tools while maintaining a responsible approach.


[1] International Chamber of Commerce, 'ICC policy statement on Artificial Intelligence' (21 November 2018) <> accessed 13 June 2023.

[2] Y. N. Harari, The Economist, 'Yuval Noah Harari argues that AI has hacked the operating system of human civilisation' (28 April 2023) <> accessed 13 June 2023.

[3] CiArb News, 'AI Technology and International Arbitration - Are Robots Coming for Your Job?' (3 February 2023), <> accessed 13 June 2023.

[4] H. Falkiewicz, Arbitras The Hague Blog, 'Artificial Intelligence in Arbitration', (9 October 2023) <> accessed 13 June 2023. See also V. Basham, The Global Legal Post, 'One-in-five large law firms issue warnings over use of generative AI or ChatGPT, survey finds', (21 April 2023) <> accessed 13 June 2023.

[5] S. Russell, et al., Artificial Intelligence: A Modern Approach (Pearson: 2016), pp 28-29.

[6] P. P. Ray, 'ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope' (2023) 3 Internet of Things and Cyber-Physical Systems, 121, p 121; McKinsey Explainers, 'What is generative AI' (January 2023) <> accessed 13 June 2023.

[7] A. Kucharavy, et al., 'Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense' (2023), p 1 <> accessed 13 June 2023.

[8] P. P. Ray, 'ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope' (2023) 3 Internet of Things and Cyber-Physical Systems, 121, p 121.

[9] A. Kucharavy, et al., 'Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense' (2023), pp 2-3 <> accessed 13 June 2023.

[10] M. Abdullah, et al., 'ChatGPT: fundamentals, applications and social impacts' (2022) Ninth International Conference on Social Networks Analysis, Management and Security (SNAMS), Milan, Italy, pp 1-8.

[11] For a more detailed explanation, see A. Kucharavy, et al., 'Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense' (2023) pp 2-5 <> accessed 13 June 2023.

[12] See A. Kucharavy, et al., 'Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense' (2023) pp 2-5 <> accessed 13 June 2023.

[13] K. Roose, 'The Brilliance and Weirdness of ChatGPT' (5 December 2022) in The New York Times <> accessed 13 June 2023.

[14] A. Prakash, 'Emergent Properties of Large Language Models (LLMs) including ChatGPT' (23 February 2023) in ThoughtSpot <> accessed 13 June 2023.

[15] P. P. Ray, 'ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope' (2023) 3 Internet of Things and Cyber-Physical Systems, 121, p 122.

[16] OpenAI, 'GPT-4 Technical Report' (27 March 2023), p 1 <> accessed 13 June 2023.

[17] J. Cushman, 'ChatGPT: Poems and Secrets' (20 December 2022) in Library Innovation Lab <> accessed 13 June 2023.

[18] S. McManus, 'Friend or foe: Can computer coders trust ChatGPT?' (31 March 2023) in BBC News <> accessed 13 June 2023.

[19] OpenAI <> accessed 13 June 2023. See also A. Kucharavy, et al., 'Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense' (2023) pp 20-21 <> accessed 13 June 2023.

[20] M. Bohannon, 'Lawyer Used ChatGPT In Court—And Cited Fake Cases. A Judge Is Considering Sanctions' (8 June 2023) in Forbes <> accessed 13 June 2023; The New York Times, 'Here's What Happens When Your Lawyer Uses ChatGPT' (27 May 2023) <> accessed 13 June 2023.

[21] E. Ferrara, 'Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models' (20 April 2023), p 3.

[22] A. Kucharavy, et al., 'Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense' (2023) pp 1, 22 <> accessed 13 June 2023.

[23] Idem.

[24] OpenAI, Privacy Policy <> accessed 13 June 2023.

[25] While the use of AI in international arbitration may also raise implications related to intellectual property ('IP') rights, it is worth noting that these concerns are more pertinent to the broader application of AI rather than being specific to international arbitration. Therefore, in the context of this discussion, we will not delve further into the IP implications associated with AI. For more comprehensive insights on this matter, see, for instance, G. Appel, et al., 'Generative AI Has an Intellectual Property Problem' (7 April 2023) in Harvard Business Review <> accessed 6 July 2023. 

[26] C. Criddle, 'Law firms embrace the efficiencies of artificial intelligence' (4 May 2023) in Financial Times <> accessed 13 June 2023.

[27] Idem.

[28] P. B. Marrow, 'Artificial Intelligence and Arbitration: the computer as an arbitrator, are we there yet?' (2020) 74 Dispute Resolution Journal 4, 35, p 36.

[29] L. F. Souza-McMurtrie, 'Arbitration Tech Toolbox: Will ChatGPT Change International Arbitration as We Know It?' (26 February 2023) in Kluwer Arbitration Blog <> accessed 13 June 2023; P. P. Ray, 'ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope' (2023) 3 Internet of Things and Cyber-Physical Systems, 121, p 136.
