scoda company networks2

Some slide prompts to support a data framing investigation around corporate data – originally prepared for the OGP Festival, London, October 2013. For more information, contact: schoolOfData.org

Upload: tony-hirst

Post on 27-Jan-2015


Page 1: Scoda company networks2

Some slide prompts to support a data framing investigation around corporate data – originally prepared for the OGP Festival, London, October 2013.

For more information, contact: schoolOfData.org

Page 2: Scoda company networks2

When I buy something from a Shell petrol station, who do I enter into a transaction with?

When Shell builds a new petrol station, who owns it?

When Shell enters into a new extraction contract, who actually enters the contract?

What, exactly, is this thing we think of as the company we refer to as “Shell”?

Page 3: Scoda company networks2

It’s a sprawl…

A complex network of interconnected companies with intertwined ownership structures and registered addresses in a wide range of countries spread across the globe.

This diagram, taken from OpenCorporates, shows companies for which Royal Dutch Shell is a beneficial owner, as well as beneficial ownership and shareholder relationships those subsidiary companies have with other companies.

Page 4: Scoda company networks2

This map shows companies in a corporate sprawl grown out from Royal Dutch Shell.

In this case, companies are connected if they share a common director, rather than a shareholding or ownership relation.

Starting with a seed company, we look for other companies that share two or more directors with the seed company. For each of these companies, we look up their directors and repeat, looking for further companies co-directed by two or more members of this extended set of directors.
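The growing procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the company and director names are invented toy data, and `grow_network`, `min_shared` and `max_rounds` are hypothetical names, not part of any OpenCorporates tool.

```python
# Hypothetical toy data: company -> set of directors (not real filings).
DIRECTORS = {
    "Seed Plc": {"Ann", "Bob", "Cat"},
    "Alpha Ltd": {"Ann", "Bob", "Dan"},
    "Beta Ltd": {"Bob", "Dan", "Eve"},
    "Gamma Ltd": {"Cat", "Fay"},
    "Delta Ltd": {"Gus", "Hal"},
}


def grow_network(seed, directors_of, min_shared=2, max_rounds=5):
    """Grow a set of companies outwards from a seed via shared directors."""
    companies = {seed}
    known_directors = set(directors_of[seed])
    for _ in range(max_rounds):
        # Add any company co-directed by enough members of the known set.
        added = {
            name
            for name, board in directors_of.items()
            if name not in companies and len(board & known_directors) >= min_shared
        }
        if not added:
            break
        companies |= added
        # Extend the director set with the boards of the new companies.
        for name in added:
            known_directors |= directors_of[name]
    return companies


network = grow_network("Seed Plc", DIRECTORS)
print(sorted(network))
```

In this toy data, Beta Ltd is only reached in the second round, once Dan has joined the extended set of directors via Alpha Ltd. Gamma Ltd shares just one director and so is never added.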

The network is organised so that companies connected by several directors are positioned closely together. The result is a network map where different groups – or clusters – of companies that share common directors tend to neighbour each other.

The size of the company name is broadly related to how ‘influential’ the company is in the network, based on the extent to which it is connected to other influential companies.
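The slides do not say which measure was used to size the labels. One common measure that matches the description ("connected to other influential companies") is eigenvector centrality, so the following is a rough sketch on that assumption. The graph is invented, and `influence_scores` is a hypothetical helper name.

```python
# Hypothetical undirected graph: an edge means "shares a director".
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def influence_scores(edges, iterations=100):
    """Approximate eigenvector centrality by repeated averaging."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    scores = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        # New score: the node's own score plus its neighbours' scores.
        # Keeping the node's own score stops the iteration oscillating.
        new = {
            node: scores[node] + sum(scores[other] for other in neighbours[node])
            for node in neighbours
        }
        total = sum(new.values())
        scores = {node: value / total for node, value in new.items()}
    return scores


scores = influence_scores(EDGES)
print(max(scores, key=scores.get))
```

Here C scores highest because it is linked to every other node, while D scores lowest because its only link is to C.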

Reading the map as if it were a geographical map, we see concentrations of different companies operating in related sectors, presumably as a result of particular directors specialising in certain areas of the business.

Note the presence of BP in there, even though this network was grown out of the seed company Royal Dutch Shell Plc – somehow, these two groupings are connected by shared directorships of some intermediate company.

Page 5: Scoda company networks2

This map shows a different view, concentrating on how directors are connected by virtue of being co-directors of the same company or companies.
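As a minimal sketch of how a director-to-director view like this can be derived from the same underlying data, the snippet below links two directors for every company whose board they both sit on. The boards are invented, and `codirector_edges` is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations

# Hypothetical boards: company -> list of directors.
BOARDS = {
    "Alpha Ltd": ["Ann", "Bob", "Dan"],
    "Beta Ltd": ["Bob", "Dan"],
}


def codirector_edges(boards):
    """Count, for each pair of directors, how many boards they share."""
    weights = Counter()
    for members in boards.values():
        # Sorting gives each pair a single canonical ordering.
        for pair in combinations(sorted(set(members)), 2):
            weights[pair] += 1
    return weights


edges = codirector_edges(BOARDS)
for (first, second), weight in sorted(edges.items()):
    print(first, second, weight)
```

The count for each pair can be used as an edge weight, so that directors who share several boards are drawn closer together.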

Page 6: Scoda company networks2

These maps are all built from data – company data – but where can we find this data? And how can we start to work with it /as data/ ourselves?

To explore that, let’s first think about what we mean by company data, before looking at strategies for discovering or

Page 7: Scoda company networks2

At their heart, complex network visualisations can be constructed from quite simply stated data sets, such as this one.

Each row represents a connection – or link – between two companies, with a “directed edge”, that is, an arrow, going from the Source element to the Target element.

Loading a simple CSV (comma-separated values) text based data file such as this into a network visualisation tool provides enough information to the tool to allow it to work out the connections between all the elements and then plot the corresponding network diagram.
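To make the two-column idea concrete, here is a small sketch that reads an edge list of this shape using only the Python standard library. The company names are placeholders, and `read_edges` is a hypothetical helper name. The `Source` and `Target` column names follow the slide.

```python
import csv
import io

# Hypothetical edge list in the two-column Source/Target layout.
EDGE_CSV = """Source,Target
Child A Ltd,Parent Plc
Child B Ltd,Parent Plc
Grandchild C Ltd,Child A Ltd
"""


def read_edges(text):
    """Return a list of (source, target) pairs, one per CSV row."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Source"], row["Target"]) for row in reader]


edges = read_edges(EDGE_CSV)
# The set of nodes is simply every name that appears in either column.
nodes = sorted({name for edge in edges for name in edge})
print(len(nodes), "nodes,", len(edges), "edges")
```

This is all a visualisation tool needs in order to work out the nodes: they never have to be listed separately.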

Page 8: Scoda company networks2

If you would like to learn more about generating network visualisations using Gephi, there are several tutorials available.

For example:
http://schoolofdata.org/2013/03/14/first-steps-in-identifying-climate-change-denial-networks-on-twitter/
http://blog.ouseful.info/2012/11/09/drug-deal-network-analysis-with-gephi-tutorial/

Page 9: Scoda company networks2

Launch Gephi and create a new project. In the data laboratory, select “Import spreadsheet” and load in a CSV file containing a list of connected companies, one pair of connected companies per row.

Note that the data file must contain one column called Source and one called Target. Also make sure that you have identified the file as being an Edge Table.

Page 10: Scoda company networks2

The controls are slightly too involved to go into here, but once you’ve loaded the data into Gephi, you can, with practice, very quickly (2-3 minutes in all) generate interactive visualisations such as the one shown here.

Page 11: Scoda company networks2

Here’s a reminder of a couple of tutorials to get you started:

http://schoolofdata.org/2013/03/14/first-steps-in-identifying-climate-change-denial-networks-on-twitter/
http://blog.ouseful.info/2012/11/09/drug-deal-network-analysis-with-gephi-tutorial/

Page 12: Scoda company networks2

So given some data represented in quite a simple, two column text based data file, we can create quite rich and complex network visualisations of our own.

But where can we get the data from in the first place?

Page 13: Scoda company networks2

OpenCorporates is a private company that has set itself the ambitious task of building a database of registered company information for every legal corporate entity in the world.

Page 14: Scoda company networks2

One of the views OpenCorporates offers over at least some of the data in its database shows how companies are connected by beneficial ownership or shareholder relationships.

Although complex, this diagram is “human readable” – the data is presented in a way that is intended to make some sort of meaningful sense to us.

Page 15: Scoda company networks2

But as well as publishing data for us humans to read, OpenCorporates also makes data available in a way that machines can read – machine readable data.

You may have heard of the term “API” in the context of data publishing websites. To all intents and purposes, an API is an interface that computers can use to get information out of websites in a way that they, and the databases they work with, can understand.

Page 16: Scoda company networks2

If you aren’t a programmer, here’s a way of getting the data out of OpenCorporates and into a tabular form you may be more comfortable with, and which we can use to generate a network diagram to display in a tool such as Gephi…

Page 17: Scoda company networks2

Using the web address – or URL – of the web page that reveals the data used to publish a corporate ownership network on OpenCorporates, we can load the data into OpenRefine.

Note that you can import data into OpenRefine from several web addresses all in one go, though the data returned from each URL should have the same format or structure.

Using multiple URLs results in a combined data set, which can be quite handy.

Page 18: Scoda company networks2

Being machine readable, the data makes more sense to OpenRefine than it probably does to us!

Select a block of data in the preview view that is typical of a set of data that you want to map into a single row in a “traditional” spreadsheet like view.

Data blocks are typically contained within braces (curly brackets); these things: { }

Note that in some machine readable data, some data blocks may be contained within other data blocks…

Each of the items in a single data block can be mapped into a separate cell – that is, a separate column – in a single row of data.

So each data block is a row, and each item in the block is a column… OpenRefine will give you a preview of how the data will look if you click the right button!
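The "each block is a row, each item is a column" idea can be sketched in Python as follows. The JSON here is made up for illustration: the field names (`relationships`, `child`, `parent`, `type`) are assumptions, not the actual OpenCorporates format, and `flatten` is a hypothetical helper.

```python
import json

# Hypothetical machine readable data: blocks nested inside blocks.
RAW = """
{"relationships": [
  {"child": {"name": "Child A Ltd"}, "parent": {"name": "Parent Plc"}, "type": "shareholding"},
  {"child": {"name": "Child B Ltd"}, "parent": {"name": "Parent Plc"}, "type": "subsidiary"}
]}
"""


def flatten(block, prefix=""):
    """Turn one data block into a flat row: one key per column."""
    row = {}
    for key, value in block.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # A block inside a block: prefix its items with the outer key.
            row.update(flatten(value, prefix=name + "."))
        else:
            row[name] = value
    return row


rows = [flatten(block) for block in json.loads(RAW)["relationships"]]
for row in rows:
    print(row)
```

Each nested item ends up in its own column, named after the path that leads to it (for example `child.name`), which is broadly the kind of layout a spreadsheet-style view gives you.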

Page 19: Scoda company networks2

Once we’re happy with the data preview, we can import the data into a more familiar looking layout.

The arrows at the top of each column pop up menus that allow us to run a wide variety of operations on a column.

One of the operations lets us change the column name, so I’m going to rename the child company and parent company columns to Source and Target.

Page 20: Scoda company networks2

We can now export the data using the Custom Tabular Exporter.

Then from the Download tab, select the CSV output type and export your data.

You should now have the two column data that you can load into Gephi.
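For readers who would rather script this step, the rename-and-export can be approximated in Python as below. This is a sketch, not what OpenRefine does internally: the input column names `child_name` and `parent_name` are assumptions, and `to_edge_csv` is a hypothetical helper.

```python
import csv
import io

# Hypothetical rows, as they might look after import.
ROWS = [
    {"child_name": "Child A Ltd", "parent_name": "Parent Plc"},
    {"child_name": "Child B Ltd", "parent_name": "Parent Plc"},
]
# Old column name -> the name Gephi expects in an edge table.
RENAME = {"child_name": "Source", "parent_name": "Target"}


def to_edge_csv(rows, rename):
    """Write the renamed columns out as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rename.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({new: row[old] for old, new in rename.items()})
    return buffer.getvalue()


csv_text = to_edge_csv(ROWS, RENAME)
print(csv_text)
```

Saving `csv_text` to a file gives a two column edge table with the `Source` and `Target` headers described earlier.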

Page 21: Scoda company networks2

OpenRefine is a very powerful tool for working with data sets.

It can be used to help harvest information from other websites by loading in data from every URL contained within a particular column (particularly machine readable data from web addresses that make calls to website APIs).

Page 22: Scoda company networks2

OpenRefine also provides tools for cleaning data within a column (for example, changing everything to UPPERCASE or Title Case), removing unwanted punctuation, or replacing one phrase with another (such as replacing Ltd with Limited).

If you have a column containing data elements at least some of which are supposed to match, or be consistent, but which aren’t, several clustering tools may be able to help.

For example, we may recognise Royal Dutch Shell PLC, ROYAL DUTCH SHELL P.L.C., and Royal Dutch Shell as representing the same thing, but a computer will treat them all as different companies.

The clustering tools will attempt to group together items that resemble each other in some way and provide you with the option of rewriting all the different flavours in the same way (for example, all the above examples as: Royal Dutch Shell plc).
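To give a feel for how such clustering can work, here is a simplified sketch in the spirit of a "fingerprint" key: lowercase the name, strip punctuation, then sort and de-duplicate the words. It is an approximation for illustration, not OpenRefine's actual implementation, and `fingerprint` and `cluster` are hypothetical helper names.

```python
import string
from collections import defaultdict

NAMES = ["Royal Dutch Shell PLC", "ROYAL DUTCH SHELL P.L.C.", "Royal Dutch Shell"]


def fingerprint(name):
    """Reduce a name to a normalised key that ignores case and punctuation."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = name.strip().lower().translate(table)
    # Sorting and de-duplicating tokens makes word order irrelevant.
    return " ".join(sorted(set(cleaned.split())))


def cluster(names):
    """Group names that share the same fingerprint."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return dict(groups)


clusters = cluster(NAMES)
for key, members in clusters.items():
    print(key, "->", members)
```

With this simple key, the first two spellings fall into one group, but "Royal Dutch Shell" does not, because it lacks the "plc" token. Catching that case needs a looser method, such as stripping legal suffixes first or using a nearest-neighbour comparison, which is one reason several clustering methods are offered.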

Page 23: Scoda company networks2

That said, if you’re keen to learn more about OpenRefine, here are some tutorials to get you started:

http://schoolofdata.org/2013/10/18/in-support-of-the-bangladeshi-garment-industries-data-expedition/
http://schoolofdata.org/handbook/recipes/cleaning-data-with-refine/
http://blog.ouseful.info/2013/03/14/first-dabblings-with-the-gateway-to-research-api-using-openrefine/
http://blog.ouseful.info/2013/05/01/a-simple-openrefine-example-tidying-cutnpaste-data-from-a-web-page/
http://blog.ouseful.info/2013/05/03/a-wrangling-example-with-openrefine-making-ready-data/
http://blog.ouseful.info/2013/06/15/working-jobs-data-with-openrefine/
http://schoolofdata.org/2013/07/26/using-openrefine-to-clean-multiple-documents-in-the-same-way/
http://schoolofdata.org/2013/06/04/analysing-uk-lobbying-data-using-openrefine/
http://blog.ouseful.info/2013/10/10/screenscraping-html-web-pages-with-openrefine-norwegian-oil-company-data/ (advanced)

Page 24: Scoda company networks2

We’ve started to see how we can get machine readable data out of OpenCorporates and into another set of tools that we can then use to start to analyse the data.

So what other data can we get out of OpenCorporates?

Note that while what follows mainly focusses on what machine readable data we can get from OpenCorporates, and where we can get it from, it’s worth also bearing in mind higher level journalistic or investigative questions, such as: what sorts of structures or relationships might we be able to discover by analysing this data?

Page 25: Scoda company networks2

Looking at web pages on OpenCorporates provides us with a human readable view of the data. But if we want to start developing our own corporate maps or looking for connections across hundreds of companies, it’s often easier to let a machine handle the task.

Let’s look again at a company page on OpenCorporates. We can see company information relating to a particular company is contained on the page in a human readable way.

Look at the web address – or URL – of the page, and compare it to the company information – do you recognise any pieces of the address in the company data?

The web address actually contains the jurisdiction and the company identifier for the company shown. So if we know the jurisdiction and the company number, we can get to this page (or the human readable version – just cut api. off the front of the web address). And if we can get to this page, we can find additional information about it, such as the registered name of the company, or its registered address.

To learn more about reading and writing web addresses, or URLs as they are also known, see: http://schoolofdata.org/2013/05/09/hunting-for-data-learning-how-to-read-and-write-web-addresses-aka-urls/

But what if we wanted to pull this information into OpenRefine? Is the data for a particular company available in a machine readable way?

Page 26: Scoda company networks2

If we look behind a company web page for a company listed on OpenCorporates, we can see the company data in a machine readable way. To get this view, simply add api. to the start of the web address of the company page.

If you pluck up courage to look at the data, you’ll see that you can start to make sense of it. What is the company name, for example? When was it incorporated? Where is its registered address? What is the company number, and which jurisdiction does it apply to?
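The "add api. to the front" trick can be written down as a pair of small helpers. Treat this as a sketch under assumptions: the `/companies/<jurisdiction>/<number>` path layout and the placeholder company number are illustrative, the function names are hypothetical, and the live API may additionally expect a version segment or an API token.

```python
from urllib.parse import urlsplit, urlunsplit


def company_page_url(jurisdiction, company_number):
    """Build a human readable page address (assumed path layout)."""
    return f"https://opencorporates.com/companies/{jurisdiction}/{company_number}"


def to_api_url(page_url):
    """Add the api. prefix to the host, if it is not already there."""
    parts = urlsplit(page_url)
    host = parts.netloc
    if not host.startswith("api."):
        host = "api." + host
    return urlunsplit(parts._replace(netloc=host))


def to_page_url(api_url):
    """Cut the api. prefix off the host to get the human readable page."""
    parts = urlsplit(api_url)
    host = parts.netloc
    if host.startswith("api."):
        host = host[len("api."):]
    return urlunsplit(parts._replace(netloc=host))


page = company_page_url("gb", "00000000")  # placeholder company number
print(to_api_url(page))
print(to_page_url(to_api_url(page)))
```

Only the host name changes between the two views. The jurisdiction and company number in the path stay exactly as they are.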

Page 27: Scoda company networks2

The OpenCorporates company data feed for a company (and the associated human readable web page) also contains a wealth of other information presented in a form that computers can read – and manipulate (we’ll see an example of that shortly…).

In this case we can see a partial listing of the officers of the company (that is, the directors), along with their appointment date, their current status, and the termination date of the appointment, if appropriate.

The important thing to realise is that in the same way we can make sense of and read normal web pages, computers can make sense of and read structured, machine readable data such as this, as well as storing it in databases that allow us to search for patterns and structures across large amounts of data in a relatively straightforward way.

Page 28: Scoda company networks2

Here’s a recap of a couple of example web addresses/URLs for machine readable data feeds relating to companies on OpenCorporates.

Can you see how to hack (that is, edit, or modify) the URLs to give the human readable web page for the corresponding company?

Could you see how to change the URL to give a web page (either in machine readable form or human readable form) for another company in the same jurisdiction? How about for a company in another jurisdiction?

Page 29: Scoda company networks2

As well as looking up information about a particular company on a web page whose address includes the jurisdiction and company number of the company whose details we want to look up, we can also search for companies on OpenCorporates.

If you look at the web address, you should be able to see that the search term appears in it, although there also appear to be some gibberish characters (these characters actually represent – or encode – the blank space characters in the search term that appears in the search box).

If I tell you that a machine readable version of the search results is available in a data form, could you guess at how to hack the URL in order to reveal this data?
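The "gibberish" is standard URL encoding, and Python's standard library can show both common ways a space gets encoded. The search term is just an example, and the `q` parameter name in the final line is an assumption used for illustration.

```python
from urllib.parse import quote, quote_plus, unquote_plus

term = "royal dutch shell"

# Spaces can be written as %20 ...
percent_form = quote(term)
# ... or, inside a query string, as a plus sign.
plus_form = quote_plus(term)

print(percent_form)
print(plus_form)

# Decoding recovers the original search term.
print(unquote_plus(plus_form))

# Hypothetical search address built from the encoded term.
search_url = "https://opencorporates.com/companies?q=" + plus_form
print(search_url)
```

Once you can spot the encoded term in an address, editing it to run a different search becomes straightforward.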

Page 30: Scoda company networks2

As well as providing machine readable versions of company information, via web addresses/URLs that contain the jurisdiction and company identifier that uniquely describe the corresponding company, OpenCorporates also provides a machine readable version of a search made on company name. This is reachable via a similar tweak to the web address to the one that gave a machine readable, rather than human readable, version of a single company page.

Given the URL, do you think you would be able to pull this machine readable data into a tool such as OpenRefine?

Page 31: Scoda company networks2

As well as the “simple” search for companies by name, OpenCorporates offers another form of search that can be more tightly integrated with OpenRefine.

It’s known as a reconciliation API, and it’s used to try to find matches for company names in the OpenCorporates database, along with an estimate of how confident the match is.

For example, if you have a list of company names in a single column, you can use the reconciliation API in an attempt to get matches in OpenCorporates for all those companies.

This works in a similar way to search, but we also get a confidence score back estimating how well the company name suggested by OpenCorporates matches the one you provided.

Tools such as OpenRefine can hook into the reconciliation API and pitch a company name to it, and the API will give a set of best guess matches of company names OpenCorporates knows about along with a confidence estimate. (You can also limit attempts at reconciliation to just those companies within a particular jurisdiction.)

When OpenRefine gets the suggestions back, you can choose to accept the most confident matches, or matches above a certain confidence. The result is an automatically enriched data set that includes OpenCorporates identifiers and registered names for each of the companies in your original list.
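The "accept matches above a certain confidence" step can be sketched as a simple filter. The response below is invented sample data laid out like a typical reconciliation result (`id`, `name`, `score`), and `accept_matches` and `best_match` are hypothetical helper names. The real scores and identifiers returned by OpenCorporates will differ.

```python
# Hypothetical reconciliation response for one company name.
RESPONSE = {
    "result": [
        {"id": "/companies/xx/000001", "name": "EXAMPLE TRADING LIMITED", "score": 62.0},
        {"id": "/companies/xx/000002", "name": "EXAMPLE TRADING (UK) LIMITED", "score": 48.5},
        {"id": "/companies/xx/000003", "name": "EXAMPLE HOLDINGS LIMITED", "score": 21.0},
    ]
}


def accept_matches(response, min_score):
    """Keep candidates at or above the threshold, best first."""
    candidates = [c for c in response["result"] if c["score"] >= min_score]
    return sorted(candidates, key=lambda c: c["score"], reverse=True)


def best_match(response):
    """Return the single most confident candidate, or None if there are none."""
    return max(response["result"], key=lambda c: c["score"], default=None)


accepted = accept_matches(RESPONSE, min_score=40)
print([c["name"] for c in accepted])
print(best_match(RESPONSE)["id"])
```

Where the threshold should sit is a judgement call: set it too low and wrong companies are accepted, too high and genuine matches are left unreconciled.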

For more on reconciling company names with OpenCorporates company listings, see: http://schoolofdata.org/2013/10/18/in-support-of-the-bangladeshi-garment-industries-data-expedition/

Page 32: Scoda company networks2

Just so you know what it looks like (because you should be getting your eye in now…), here’s an example of the data returned by the OpenCorporates reconciliation API. Note the confidence score associated with each item in the response list.

You may notice that the web address for this API call has a slightly different form to the other API calling web addresses you have seen. You should also note that no human readable web page version corresponding to this data exists.

Of course, as well as calling the reconciliation API, if you can construct the web address/URL yourself, you can also load data into a new OpenRefine project from that address directly.

Page 33: Scoda company networks2

Here’s a summary of the search and reconciliation API web addresses.

Do you think you could hack the URLs to run searches for other company names?

Page 34: Scoda company networks2

As well as information about companies, OpenCorporates also tries to collect information about company directors. This includes the director’s full name as well as the start date and, if appropriate, the termination date of the appointment.

If we are collecting data around a particular company, one thing we might look at is how directorial appointments change over time in the various companies associated with a particular grouping, as well as the dynamics of how companies are formed and dissolved.

At the moment, a separate page exists for each person-who-is-a-director-of-a-particular-company. If the same human is a director of two companies, that human will be associated with two separate director numbers (one corresponding to the directorship of the first company, a second corresponding to their directorship of the other company).

Hopefully, in time we will also get a unique identifier for a particular person, and a mapping to the director-company identifiers associated with pages such as the one shown here that describe a particular director role with a specific company.
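Until a unique person identifier exists, one rough workaround is to group the separate director-company records by a normalised form of the name. The sketch below uses invented records and hypothetical field and function names. It only suggests candidates: two different people can share a name, so each group still needs checking by hand.

```python
from collections import defaultdict

# Hypothetical records: each id is one director role at one company.
OFFICERS = [
    {"officer_id": 101, "name": "Jane A. Example", "company": "Alpha Ltd"},
    {"officer_id": 202, "name": "JANE A EXAMPLE", "company": "Beta Ltd"},
    {"officer_id": 303, "name": "Sam Sample", "company": "Alpha Ltd"},
]


def name_key(name):
    """Lowercase, drop punctuation and collapse repeated spaces."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def candidate_people(officers):
    """Group director-company ids that may belong to the same person."""
    groups = defaultdict(list)
    for record in officers:
        groups[name_key(record["name"])].append(record["officer_id"])
    return dict(groups)


people = candidate_people(OFFICERS)
for key, ids in people.items():
    print(key, ids)
```

Here the two spellings of the first name collapse into one candidate person with two director numbers, which is the mapping the slide hopes will one day be provided directly.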

Page 35: Scoda company networks2

As with company information pages, we can also get a peek behind the scenes at the data associated with a particular director in the context of a particular company.

Given the URL of a human readable company director web page, do you think you could get hold of the data version?

Page 36: Scoda company networks2

As well as searching for companies, we can also search for directors. As before, if you look at the web address for the search, you may be able to recognise that the search term that appears in the search box on the web page also appears in the URL that acts as the web address for that human readable web page.

What happens if you filter the results by jurisdiction? Click on one of the jurisdiction links and then look at the URL of the page. Do you notice anything different about it?

(The same trick works for searches on companies…)

Knowing what you know about the OpenCorporates API, could you guess at a URL that might return a data version of this page?

Page 37: Scoda company networks2

A slight tweak to the URL of the human readable search page, and we get the machine readable data.

Do you think you would be able to find a way of pulling data relating to a search on a particular director into OpenRefine?

Do you think you could hack the URL to pull data about a search for a particular director name, limited to appointments within a particular jurisdiction, into OpenRefine?

Recalling that OpenRefine can load in data from several URLs, do you think you could find a way of pulling data into OpenRefine from searches within a particular territory on a director’s name with two different spellings, or the same director’s name in two different jurisdictions?

Page 38: Scoda company networks2

Here’s a quick recap of how to pull data out of OpenCorporates that relates to a search for a particular director, and how to further limit the search to appointments made to companies registered within a particular jurisdiction.

Note that there is no reconciliation API for director names.
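One way to prepare a batch of director searches for loading into OpenRefine in one go is to generate the URLs programmatically. This is a sketch under assumptions: the base address and the `q` and `jurisdiction_code` parameter names are illustrative and should be checked against the OpenCorporates API documentation, the director names are invented, and `officer_search_urls` is a hypothetical helper.

```python
from itertools import product
from urllib.parse import urlencode

# Assumed address of the machine readable director search.
BASE = "https://api.opencorporates.com/officers/search"


def officer_search_urls(names, jurisdictions):
    """Build one search URL per (name spelling, jurisdiction) pair."""
    urls = []
    for name, code in product(names, jurisdictions):
        query = {"q": name}
        if code:
            # Leave the filter out entirely when no jurisdiction is given.
            query["jurisdiction_code"] = code
        urls.append(f"{BASE}?{urlencode(query)}")
    return urls


urls = officer_search_urls(["Jon Smith", "John Smith"], ["gb", "nl"])
for url in urls:
    print(url)
```

Pasting the resulting list into OpenRefine's import-from-web-addresses box would then give a single combined data set, provided each URL returns data with the same structure.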

Page 39: Scoda company networks2

If you want to know more, contact us…