Patents are complicated publications that contain many data elements that are essential for describing the invention. There are many active patents from different national patent offices. Patent documents themselves contain a lot of data: priority date, inventors, classifications, applicants and claims (just to name a few).

Currently, searches for prior art and other patent searches are executed with composed keyword queries, which are not only time-consuming but also error-prone.


It is common to conduct a search of free text data to identify those data records that satisfy a predefined query. Regardless of the data source, searches may be conducted to identify the data records that include one or more search terms identified by the query. The data records that are returned from a search may then be reviewed.
The quality of a search may be defined by its recall and its precision. Recall relates to the number or percentage of correct answers that are returned relative to all of the correct answers within the data source(s) that are searched. Searches that identify a greater percentage of the correct answers have a greater recall. Precision relates to the number or percentage of answers that are returned that are correct. Thus, searches that provide a greater percentage of correct answers have a greater precision.
Patent searchers try to achieve the highest level of precision and recall, without missing any relevant documents.
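As a minimal sketch of the two definitions above, precision and recall can be computed from the set of documents a search returned and the set of documents that are actually relevant. The function name and the publication numbers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved and relevant document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # correct answers that were actually returned
    # Precision: share of the returned documents that are correct.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of all correct documents that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document ids, for illustration only.
retrieved = ["EP1", "EP2", "EP3", "EP4"]
relevant = ["EP2", "EP4", "EP5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this example two of the four returned documents are relevant (precision 0.50), and two of the three relevant documents were found (recall 0.67). The missed document, "EP5", is exactly the kind of omission a patent searcher tries to avoid.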



Patent searching
Missing relevant patent documents during a search is unacceptable because of the high commercial value of patents. It is therefore important to retrieve all potentially relevant documents rather than only a small subset of them.
But even when only a few patent documents are retrieved, analysing the result is not trivial.
For instance, determining patentability involves analysing earlier patent documents that may be relevant to novelty and/or inventive step.
Professional searchers know how to craft search strings using strategically selected keywords and Boolean operators. Analysing the desired outcome of the search and “synthesizing” a query that produces it is an advanced skill that allows people to “speak” to a search engine. Those who have these skills are often professional, qualified patent searchers.
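To make the idea of a Boolean search string concrete, here is a small sketch of keyword filtering over a handful of abstracts. It is an illustration under stated assumptions, not how any particular patent database works: the `matches` helper, the document ids and the abstracts are all invented for this example.

```python
import re


def tokens(text):
    """Lower-case a text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(text, all_of=(), any_of=(), none_of=()):
    """Boolean filter: every all_of term, at least one any_of term, no none_of term."""
    words = tokens(text)
    return (
        all(term in words for term in all_of)
        and (not any_of or any(term in words for term in any_of))
        and not any(term in words for term in none_of)
    )


# Hypothetical abstracts, for illustration only.
abstracts = {
    "DOC-A": "A coated stent releasing a drug into the vessel wall.",
    "DOC-B": "A polymer coating for a cardiac implant with drug release.",
    "DOC-C": "A stent delivery catheter without any coating.",
}

# Query: (stent OR implant) AND drug NOT catheter
hits = [
    doc_id
    for doc_id, text in abstracts.items()
    if matches(text, all_of=["drug"], any_of=["stent", "implant"], none_of=["catheter"])
]
print(hits)

# A single exact keyword behaves differently from what one might hope.
coating_hits = [d for d, text in abstracts.items() if matches(text, all_of=["coating"])]
print(coating_hits)
```

The second query shows why crafting such strings is a skill: searching on "coating" alone misses DOC-A, which says "coated", and returns DOC-C, which explicitly has no coating.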

Database producers have worked hard over the years to improve searching, using natural language searching, semantic searching, and other techniques.
1998 was quite a year in the development of patent searching. Espacenet was launched as the first free, internet-based patent search engine. And it was very useful: you could run a patent search by entering a set of keywords and other criteria, and it would return all patents that met these criteria.
I agree that there remains a place for traditional keyword and class code searching. However, just as web searching has come to offer other options, so has patent searching. AI was introduced in addition to “traditional searching”.



Artificial intelligence
The generally accepted definition of Artificial Intelligence (AI) is the demonstration of intelligence by machines. More commonly, it’s a term that is used when we use a machine to mimic cognitive human functions such as learning and problem-solving.
And in a broad sense it can be any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.
AI has found application in a number of general areas like speech recognition and autonomous vehicles.
In the IP area, artificial intelligence is intended to reduce costs, create stronger patents and determine the value of patents in the marketplace more accurately.


Patent searching and Artificial intelligence
Conventional patent searching relies on expert searchers who use keywords, class codes, etc., and review the results produced by these terms, which are often ranked.
Historically, the approach to prior art searching has been to determine a few keywords and classes from the patent application and use them to retrieve published applications and patents.
Is this artificial intelligence? I would suggest not; instead, it relies on the intelligence of the searchers. The role of the patent database is to follow the instructions of these searchers.
However, artificial intelligence is found in several other forms of patent searching. Semantic searching is one such system. Semantic searching is by no means new. Most semantic engines are generic text, or generic technology language engines.
Semantic analysis can be used where algorithms look for similar language in other documents.
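The idea of an algorithm looking for similar language in other documents can be sketched with a simple similarity ranking. Note the assumption: the sketch below uses TF-IDF weighting with cosine similarity, which compares word overlap only and is a crude stand-in for what semantic engines do, not a description of any actual product. All names and texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document id to a {term: weight} vector using smoothed TF-IDF."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    vectors = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        vectors[doc_id] = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return vectors


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(query_text, docs):
    """Return (doc_id, score) pairs, most similar to the query text first."""
    vectors = tfidf_vectors({**docs, "_query": query_text})
    query_vector = vectors.pop("_query")
    scores = {doc_id: cosine(query_vector, vec) for doc_id, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical abstracts, for illustration only.
docs = {
    "DOC-A": "A stent coated with a polymer that releases a drug.",
    "DOC-B": "A bicycle frame made of carbon fibre.",
    "DOC-C": "Drug releasing polymer coating for a vascular stent.",
}
ranking = rank_by_similarity("polymer coated stent for drug release", docs)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Instead of a yes/no filter, the searcher gets every document with a score, so the unrelated bicycle document ends up at the bottom of the list rather than being excluded by a hand-written query.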
Text mining is another search technology that can be used.
Text mining, also referred to as text data mining and more or less equivalent to text analytics, is the process of deriving high-quality information from text.

Other examples of AI in patent searching include variations on the class code information, citation analysis, and perhaps other undisclosed algorithms.

In most cases AI approaches are designed to be very simple to use, and to be used alongside traditional searching methods, because no single technique may give you perfect results. AI techniques can be very cost-effective because they produce an efficiently ranked list of results.


Resistance to AI in patent searching
Plenty of professional patent searchers are using AI patent searching as part of their set of tools, but some are reluctant to use these tools, instead sticking to conventional filter-based searching.
Stated reasons for this include:
1. Searching by keywords and class code filtering is the way that things ‘have always been done around here’. ‘It is the safe approach’, and easy to describe to clients.
Answer: The purpose of patent searching is to find relevant patents, and not to run a patent search process the way it has always been run. By using a range of methods including AI patent searching, you can increase your chance of finding relevant patents, and so meet your purpose and improve the quality of your search.

2. AI searching can involve ‘black box’ methods, and we should not use black box approaches.
Answer: Whether we admit it or not, we use black box methods to search online all of the time. Not just Google, but the likes of Amazon, Airbnb and eBay all use black box algorithms to provide a list of ranked options for you to peruse when you use them to run a search of some sort. And, for example, who cares how ABS works in your car as long as it does its job?



3. What is wrong with searching by keyword filtering in any case?
Answer: Searching by keyword/class code filtering can work well in some cases. In other cases you will miss relevant patents with unexpected keywords or class codes. And because it brings up lots of low-relevance patents, it can be very time-consuming to go through them all.

4. While searching for patents using the conventional method can be time-intensive, we have plenty of time in our organisation, so what does it matter?
Answer: In reality, all of us are under time and cost constraints. Our clients and managers want the best possible results at the least possible cost.

Many patent searches are done by text searching and, to a lesser extent, by semantic search engines. The problem with such tools is that their broad searching capabilities return a lot of noise along with the useful results.
So the answer to the question “Have we moved beyond keyword and class code filtering in patent searching?” is yes! Many searchers are incorporating a variety of AI search techniques into their search processes and using them alongside traditional searching methods.
And the others? They are still defining a high quality search using principles that have not changed much since 1998, back in the days when Espacenet was introduced.


Aalt van de Kuilen

Senior Patent Information Specialist in the field of Life Sciences and Chemistry.