Recent developments in patents statistics and data bases
at EPO and OECD
EPIP – Bocconi, February 24-25, 2006
Dominique Guellec, OECD
Structure of the presentation
1. Patents databases for statistical uses: Patstat
2. Patent indicators for macroeconomic analysis: Patent families
Patents databases: Current situation
Patent data are difficult to access; each analyst/researcher has to set up his/her own database, extracted from data published by patent offices
=> high cost, duplication of costs,
=> uneven quality,
=> absence of standardisation,
=> lack of transparency.
Purpose of Patstat
Patstat is a response to these needs. It is a database of patents designed for statistical purposes => format compatible with SQL, SAS etc.
Could be used for compiling indicators or conducting analytical work (policy, academic)
Contents
• Documentation: Data coming from 73 offices worldwide, since the beginning of the 20th century for certain offices.
• Post grant data: About 40 offices.
Variables in Patstat
• Application information (dates, numbers)
• Applicants information
• Inventors information
• Priorities information
• IPC classes information
• National patent classes information
• Publications information
• References (citations) information (about 10 countries)
• Licence information
• Entry into force information (by country)
• Lapse information (by country)
Cleaned names
• Current effort sponsored by Eurostat for cleaning the names of applicants at EPO and USPTO (correcting misspellings etc.).
• Cleaned names will be made available in Patstat.
Patstat sources
EPO sources:
• DocDB
• PRS
• EPASYS
• CDS
Other sources:
• US publications
• EUROSTAT name mapping
• ...
An evolving product
• Will adapt to needs expressed by users
• More variables could be added, e.g. procedural data in EPO etc.
• Construct various tools for manipulating the data and complementary tables (e.g. families, citations)
Contribution of users
• Checking quality (more than 50 million records) => reporting defects to the EPO!
Further needs:
• Cleaning names for non-Western companies (Asia)
• Cleaning SME names
• Consolidating groups of enterprises
Conditions of access
• The first complete version is to be issued in early April 2006, followed by two updates per year.
• Available to all users committing to non-commercial use and no further dissemination of the data.
A hub
Patstat will find its place in the growing industry of patents databases: due to its harmonised priority numbers, it could be used as a pivot to match data from various patent offices – hence allowing “harmonised diversity”.
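The pivot idea can be sketched as a join on the priority number. This is a minimal illustration only: the record layout, field names and identifiers below are hypothetical and are not Patstat's actual table structure.

```python
from collections import defaultdict

# Hypothetical extracts from two offices. Field names and numbers are
# invented for illustration; they are not real Patstat tables or documents.
epo_records = [
    {"epo_appln": "EP-0001", "priority": "JP-1997-111"},
    {"epo_appln": "EP-0002", "priority": "US-1997-222"},
]
uspto_records = [
    {"uspto_grant": "US-9001", "priority": "JP-1997-111"},
    {"uspto_grant": "US-9002", "priority": "DE-1997-333"},
]


def match_on_priority(left, right, left_key, right_key):
    """Pair records from two offices that claim the same priority number."""
    # Index the right-hand records by their (harmonised) priority number.
    by_priority = defaultdict(list)
    for rec in right:
        by_priority[rec["priority"]].append(rec[right_key])

    # Look up each left-hand record's priority in that index.
    pairs = []
    for rec in left:
        for other in by_priority.get(rec["priority"], []):
            pairs.append((rec[left_key], other, rec["priority"]))
    return pairs


matches = match_on_priority(epo_records, uspto_records, "epo_appln", "uspto_grant")
print(matches)
```

The join only works if both sources write the priority number in the same format, which is what harmonised priority numbers are meant to provide.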
Patent indicators
As an extremely rich source of information, patents can be used as indicators reflecting the technological activity of countries => location of R&D, circulation of knowledge, co-operation in R&D, specialisation, technological performance etc.
BUT... possible noise and biases in the data make elaborate filtering necessary.
Sources of noise and bias
• Patents are complex entities... various types of titles (e.g. applications vs. grants, priorities vs. divisionals), cross country differences and changes over time in legal systems.
• Heterogeneity in value (highly skewed distribution).
• Patenting strategies of companies create distortions in the data (e.g. cross-industry differences in propensity to patent, home bias etc.) => patent data reflect competitive strategy rather than just technology?
Country shares of patents applied for at the EPO and patent grants by the USPTO for priority year 1997, % (Source: OECD)
[Figure: stacked bars, values read from the bar labels]
                  EPO    USPTO
European Union    46.6   16.4
United States     28.7   52.8
Japan             16.7   20.7
Other countries    8.0   10.1
Grants or applications?
Country shares in JPO patents, 2004, %
          Applications  Grants
Europe     4.8           4.3
US         5.2           3.8
Japan     83.0          90.7
Others     7.0           1.2
One candidate as a solution: Triadic families
• A patent family is a set of applications or patents filed in different offices to protect the same invention.
• A Triadic family (OECD definition) is a set of applications at the EPO and JPO and grants by USPTO which share one or more priorities.
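The triadic definition above can be read as a membership test on each priority. The sketch below is a simplified illustration, not the OECD's production method: the filings, the `(office, kind, priority)` layout and the labels `P1`/`P2` are hypothetical, and it checks priorities one by one without merging families that share several priorities.

```python
from collections import defaultdict

# Hypothetical filings: (office, kind of document, priority number claimed).
filings = [
    ("EPO", "application", "P1"),
    ("JPO", "application", "P1"),
    ("USPTO", "grant", "P1"),
    ("EPO", "application", "P2"),
    ("JPO", "application", "P2"),
    ("USPTO", "application", "P2"),  # filed but not granted: does not qualify
]

# A priority is triadic when all three kinds of document claim it.
REQUIRED = {("EPO", "application"), ("JPO", "application"), ("USPTO", "grant")}


def triadic_priorities(filings):
    """Return the priorities claimed by an EPO application, a JPO application and a USPTO grant."""
    seen = defaultdict(set)
    for office, kind, priority in filings:
        seen[priority].add((office, kind))
    return sorted(p for p, docs in seen.items() if REQUIRED <= docs)


triadic = triadic_priorities(filings)
print(triadic)
```

Here `P2` is excluded because its USPTO member is only an application, which mirrors the asymmetry in the definition (applications at the EPO and JPO, grants at the USPTO).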
Advantages of patent families
Address two issues in patent counts
=> heterogeneity in value
=> cross country biases
Heterogeneity in value
• Patent filing is costly (fees, translation, attorney, enforcement) => applicants are selective: Filing in several jurisdictions should be justified by expected value.
• Members of triadic families are more cited than other patents, have more claims etc.
Home advantage
• Families are measured on a more neutral ground than applications filed in a single jurisdiction.
Country shares in patent indicators, priority year 1999, % (Source: OECD)
[Figure: stacked bars, values read from the bar labels]
                  EPO    Triadic patent families   USPTO
Japan             17.4   26.6                      20.6
United States     27.8   34.0                      52.6
European Union    46.5   32.4                      15.9
Other countries    8.3    7.0                      10.9
Technical problems in compiling patent families
• No one to one correspondence between filings in different countries (e.g. two JPO priorities will make one USPTO application and one EPO application), plus problem with divisionals etc.
=> family counts could be biased if one counts ALL priorities.
• OECD solution = "consolidation": All applications sharing one or more priorities are counted as ONE family.
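Consolidation as described above amounts to finding connected groups of applications linked by shared priorities. The following is a minimal sketch using a union-find structure; this is one convenient way to compute such groups, not necessarily how the OECD implements it, and all application and priority identifiers are hypothetical.

```python
def consolidate(applications):
    """Group applications into families: applications linked by a shared priority end up together."""
    parent = {}

    def find(x):
        # Return the representative of x's group, registering x if it is new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link every application to each priority it claims.
    for app, priorities in applications.items():
        for prio in priorities:
            union(("app", app), ("prio", prio))

    # Collect applications by the representative of their group.
    families = {}
    for app in applications:
        families.setdefault(find(("app", app)), set()).add(app)
    return sorted(sorted(members) for members in families.values())


# Hypothetical data: two JPO priorities feed one USPTO and one EPO application.
applications = {
    "US-A": {"JP-1", "JP-2"},
    "EP-A": {"JP-1", "JP-2"},
    "JP-A1": {"JP-1"},
    "JP-A2": {"JP-2"},
    "EP-B": {"DE-9"},
    "US-B": {"DE-9"},
}

families = consolidate(applications)
n_priorities = len({p for prios in applications.values() for p in prios})
print(n_priorities, len(families))
```

In this toy case, counting one family per priority gives 3, while consolidation gives 2: `JP-1` and `JP-2` collapse into a single family because the same USPTO and EPO applications claim both.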
The impact of consolidation on family number (source: OECD)
[Figure: number of families (axis 0 to 60 000) by year, 1985-1997; series: Basic patent families A, Consolidated patent families A*; annotation: Consolidation filter]