Machine learning first found application in the IT sector, then quickly spread to the fields adjacent to IT, and after that to other markets. Data analysts, also known as data scientists, now help factories, banks, construction companies, sports clubs: the list is endless. Law is no exception. I have been watching this process for about eight years, and I recently came to Yandex's Data&Science conference to talk about big data in the work of lawyers.

Data&Science is a Yandex initiative to develop the community of data scientists and bring them closer to potential customers from other fields: people tell one another about their challenges and find ways to help each other on various projects. Preparing my talk for the conference was what prompted me to write this article. I will describe how machine learning is making its way into law, why this is happening more slowly than we would like, and which smart services will help make legal processes more efficient in the future.

For the uninitiated

First, let me explain what machine learning is. If you are familiar with the concept, feel free to skip to the next section.

Machine learning (ML) is when a large amount of data on some topic is loaded into a system in order to identify patterns in that data. The machine can use the resulting “knowledge” in several ways. For example, by loading an array of voice recordings and texts, we teach the system to speak and understand speech. This is how voice assistants work: Apple’s Siri, or Alice from Yandex.

Having learned your musical tastes, services like Apple Music, Yandex.Music and Spotify can then recommend new tracks to you. There are many examples, but the idea is the same: load a lot of data and build from it a smart service that simplifies your life.

How it all began

Machine learning is a form of automation, so let’s start with a simple question: what were the first examples of automating a lawyer's work? Let’s be honest: law is a field with a lot of routine. Which programs helped lawyers make their work easier? The very first and most important one was, of course, Microsoft Word. Its appearance in 1983 completely changed the way legal documents are prepared. Its many text formatting options had not previously been available to a wide audience. Perhaps the fact that Word caught on with lawyers contributed to its popularity among millions of users.

The late 1980s and early 1990s saw the next leap in the development of the industry we now call legal tech: legal reference systems. The best-known examples in Russia are “Consultant” and “Garant”.

Legislation changes all the time, and it was important for lawyers to find out about every change quickly. The main simplifying factor here was, of course, the advent of the Internet, but even before that people found ways to keep the reference databases up to date.

In the early 1990s, if the articles of some law changed, couriers brought floppy disks with the new versions of the documents to customers who had purchased a legal reference system. Also very useful was the ability to view (for example, in “Consultant”) the history of changes to any article. The point is that when studying historical material, a lawyer must understand which legal rules were in effect at the time.

No less of a milestone in legal tech was the release of ABBYY FineReader, which can convert scanned documents into text. Incidentally, modern versions of FineReader do, of course, use machine learning.

e-Discovery: the “driver” of the market

Fast forward closer to the present. When did ML first enter law? In fact, not much later than in other fields, despite the industry's conservatism. In the United States, a process called electronic discovery, or e-Discovery for short, plays a much greater role than in Russia. It is a broad concept covering the exchange of legal documents in digital form, but I will focus on one specific case of e-Discovery: when one party to a lawsuit sends the other party a list of documents for review. All of these documents, according to the first party, are relevant to the case. The list may contain 2 million documents (I’m not exaggerating), and clearly nobody can study all of them.

The first known use of machine learning in the legal field was prioritizing documents for e-Discovery. The system analyzed the contents of the list and suggested reviewing only the most significant documents, about 300,000 to 400,000 of them.
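To make the idea of prioritization concrete, here is a minimal sketch. It assumes that a lawyer has already marked a small sample of documents as relevant or not, and that a simple naive Bayes text classifier then ranks the remaining documents by how relevant they look. The function names, the sample texts and the choice of naive Bayes are my illustrative assumptions, not a description of how any particular e-Discovery product works.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Count words per class from (text, is_relevant) pairs reviewed by a lawyer.

    Assumes at least one relevant and one irrelevant example.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labeled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def relevance_score(text, model):
    """Naive Bayes log-odds that a document is relevant (higher = review sooner)."""
    word_counts, doc_counts, vocabulary = model
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocabulary:
            continue  # words never seen in training carry no evidence
        # Laplace (add-one) smoothing avoids zero probabilities
        p_relevant = (word_counts[True][word] + 1) / (totals[True] + len(vocabulary))
        p_irrelevant = (word_counts[False][word] + 1) / (totals[False] + len(vocabulary))
        score += math.log(p_relevant / p_irrelevant)
    return score


def prioritize(unreviewed, model, top_n):
    """Return the top_n documents that look most relevant."""
    ranked = sorted(unreviewed, key=lambda text: relevance_score(text, model), reverse=True)
    return ranked[:top_n]


# Hypothetical sample: four documents already reviewed by a lawyer
reviewed = [
    ("payment schedule under the supply contract", True),
    ("breach of the supply contract and penalty", True),
    ("office party invitation for friday", False),
    ("lunch menu for the cafeteria", False),
]
unreviewed = [
    "friday lunch invitation",
    "penalty for late payment under the contract",
    "cafeteria closed for the office party",
]

model = train(reviewed)
top = prioritize(unreviewed, model, top_n=1)
print(top)
```

On a real list of millions of documents the same principle applies, only with far more training examples and a stronger model: the lawyer reviews the top of the ranking first instead of reading everything.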

The need to simplify e-Discovery was the catalyst for the emergence of the first ML legal tech startups in 2010-2012. To this day this area, along with legal research and the management of contracts between firms, remains one of the key areas of legal tech. The best-known startups connecting law with IT succeeded thanks to their work on e-Discovery. Examples are Relativity from Chicago and Everlaw from California. Software and algorithms for e-Discovery are universal in their own way: similar solutions are used in corporate and financial investigations.

Here lies the answer to the question of why Russia has fewer legal tech companies, and very few when it comes to ML legal tech. Our litigation does not involve the need to study so many documents. Under the Russian procedural codes, the plaintiff submits, along with the claim itself, the documents that support his arguments and demands. So automation of the kind used in e-Discovery is not needed.

Where this applies

Before moving on to the other reasons why the industry is developing slowly, let's talk about the positive: successful examples of applying ML in law and related fields. In our company's solutions, machine learning algorithms are used for time tracking. The fact is that many lawyers bill their clients by the hour. We analyze a user's work and build a behavioral model of him, taking into account the types of tasks across his cases and their complexity. In addition, we look at how much time he has spent on these and similar tasks: lawyers usually record the time elapsed.
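A very simplified sketch of the time-tracking idea follows. It assumes that past time entries are available as (task type, complexity, minutes) records and suggests the median time for a new task, falling back to the task type alone when there is no exact match. The record format, the names and the use of a median are assumptions made for illustration; a production behavioral model would be considerably richer.

```python
from collections import defaultdict
from statistics import median


def build_model(history):
    """Group past time entries (task_type, complexity, minutes) for lookup."""
    by_type_and_complexity = defaultdict(list)
    by_type = defaultdict(list)
    for task_type, complexity, minutes in history:
        by_type_and_complexity[(task_type, complexity)].append(minutes)
        by_type[task_type].append(minutes)
    return by_type_and_complexity, by_type


def suggest_minutes(model, task_type, complexity):
    """Suggest a time entry from similar past tasks, or None if there are none."""
    by_type_and_complexity, by_type = model
    if (task_type, complexity) in by_type_and_complexity:
        return median(by_type_and_complexity[(task_type, complexity)])
    if task_type in by_type:
        # No task of this complexity yet: fall back to the task type as a whole
        return median(by_type[task_type])
    return None


# Hypothetical history of one lawyer's recorded time
history = [
    ("contract review", "low", 30),
    ("contract review", "low", 40),
    ("contract review", "high", 120),
    ("court filing", "high", 90),
]

time_model = build_model(history)
print(suggest_minutes(time_model, "contract review", "low"))     # exact match
print(suggest_minutes(time_model, "contract review", "medium"))  # fallback by type
print(suggest_minutes(time_model, "due diligence", "low"))       # nothing similar
```

The median is used here rather than the mean so that one unusually long task does not distort the suggestion.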

In America, there is a system that assesses the likelihood that a defendant will reoffend. The judge can learn this probability and take it into account when deciding on a guilty or not-guilty verdict.

In the American police, an ML model predicts where future crimes will be committed and shows the locations on a map of the city (!).

It is a known fact that when people search for precedents or similar cases, the list they compile is often incomplete (and judges take note of this). It is a different matter when ML is used for the search. Another example: artificial intelligence is 9% more accurate at reviewing non-disclosure agreements (NDAs), and the machine takes far less time: 26 seconds versus an average of 92 minutes for humans.
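How a machine might search for similar cases can be shown with a small sketch. It assumes each case is available as a short text and ranks cases by the cosine similarity of their word counts to the query. The case texts and identifiers are invented, and real search systems rely on much stronger text representations than plain word counts.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Represent a text as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(query, cases, top_n=2):
    """Return the ids of the top_n cases most similar to the query."""
    query_vector = vectorize(query)
    ranked = sorted(
        cases.items(),
        key=lambda item: cosine(query_vector, vectorize(item[1])),
        reverse=True,
    )
    return [case_id for case_id, _ in ranked[:top_n]]


# Hypothetical case summaries
cases = {
    "case-1": "tenant refused to pay rent after flood damage",
    "case-2": "supplier delivered defective goods and buyer demanded refund",
    "case-3": "landlord sued tenant for unpaid rent",
}

similar = find_similar("unpaid rent dispute between landlord and tenant", cases)
print(similar)
```

Because every case in the collection is scored, nothing is skipped through oversight, which is exactly where a manually compiled list tends to fall short.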

ML helps with legal advice too. A few years ago Dentons, the largest law firm, used its venture fund to finance the Ross system, which is trained on the history of bankruptcy cases. Ross is meant to advise a client who asks for advice. The dialogue with Ross takes place in real time, in natural language and without a lawyer's involvement. The system is built on the IBM Watson technology stack. A similar robot, which advises only on the Russian law on the protection of consumer rights, was introduced this year by the Skolkovo resident “Legislator”.

Why the future is yet to come

But why are there still so few examples of ML being applied in the legal industry? To answer this question, let's first make sure that they really are few. DLA Piper, one of the largest law firms (with resources that are huge by industry standards), admitted not long ago that it uses machine learning and artificial intelligence at a level of only 1%. Dentons, mentioned in the previous section, is also introducing machine learning into its products very slowly, despite its initiatives.

One of the problems is the lack of digitized documents on which an intelligent system could be trained. For example, bank employees do not always digitize loan agreements. Paper is harder to store and sometimes difficult to access: specialists have to travel to archives and pay for copies, not to mention that such materials cannot be searched. In addition, paper documents are easier to forge.

In Russia, falsifying contracts during bankruptcy is a widespread practice: when the assets of a bankrupt company are distributed among the creditors, a fake creditor may turn up among them, a firm that never actually lent to the bankrupt but holds a forged document stating the opposite. If the fraudulent firm manages to deceive the court, it receives part of the assets, and the genuine creditors lose that part.

Introducing blockchain and other smart payment technologies, and digitizing all contracts at the moment they are concluded, would solve the problem, but the industry is not yet ready for such steps.

The very form of partnership typical of law firms is not conducive to investment in IT. Profit, if there is any, gets distributed here and now, to solve urgent problems, rather than invested in development with the prospect of future improvements.

Another reason for the slow penetration of ML is the lack of common registries. The US has a system called Public Access to Court Electronic Records (PACER), but it aggregates only the documents of the federal courts. All other American courts are not connected to any single system. However, this gives them some freedom in choosing technologies. A nice bonus is the large amount of IT research carried out in courts at various levels.

To conclude on the American judicial system, I should mention a fine initiative by the company Ravel Law and Harvard's law school. They recently announced the completion of the Caselaw project, which digitized the largest collection of court cases in the United States, from 1658 to 2018. That is 6.4 million cases and more than 40 million pages of documents, which are now available to data scientists. Incidentally, data analysis in law is based both on texts and on numeric data derived from the texts.

We can assume that the lack of data transparency creates additional difficulty for ML legal tech. Indeed, in our world, common APIs, open databases and the like are rarer than in IT. Each company values its data and, as a rule, does not want to share it for free. On the other hand, startups cope with this closedness successfully, if only because they are already affiliated with some large organization.

Law also has all the problems that arise where ML meets any other industry. One example is the lack of clean data. Scanned documents need to be double-checked, and the correctness of the tables used in training needs to be monitored.

There are positive trends

“Slow” law firms increasingly have to work with the much faster legal departments of large companies (including Yandex), where record-keeping processes are well automated. This gives them the right example to follow. In addition, these companies often bring in firms from outside the legal sphere for some tasks, and those firms also have automated processes and serve as an excellent example.

Another positive trend is the growing number of new legal tech startups. In Russia we see more and more young professionals who already know which problems in the industry can be solved with ML and who do not want to be lawyers in the classic sense of the word. Even compared with 2012-2013, there are far more such young people.

This expertise needs support from the major players, and that support is gradually appearing. Judging by the initiatives of the Federal Tax Service, we can assume that it collects a lot of data and is implementing smart technologies.

Sberbank is also being fully automated, and it is the largest bank in Russia and one of the largest in Europe.

We are even close to launching legally oriented data analysis competitions; at least, our team has thought about it. Such competitions are held on the Kaggle platform. Participants are invited to build a machine learning model on data provided by the customer, after which the authors of the most effective algorithms receive cash prizes.

In the future, ML systems will be advanced enough to automatically analyze the actions of a company or an individual for compliance with the law. However, we are still far from such a bright future.
