Full text search with App Engine Search API and Django Kunal Grover

Upload: kunal-grover

Post on 12-Apr-2017


Page 1: Full text search with App Engine Search Api and Django

Full text search with App Engine Search API and Django

Kunal Grover

Page 2: Full text search with App Engine Search Api and Django

Why use Google App Engine?
- Super fast prototyping and getting to market
- Auto scaling
- Supports Django very well
- MySQL, NoSQL, Memcache, Cloud Storage ready to set up
- Cheap
- Lots of Google APIs to use

Page 3: Full text search with App Engine Search Api and Django

Adding Full Text Search
- MySQL: not recommended
- External Solr server: great performance, but $$$$
- Google App Engine Search

[Diagram: GAE connected to an external Solr server (AWS/Google Cloud Platform), compared with a GAE instance connected to the Google Search Index service]

Page 4: Full text search with App Engine Search Api and Django

The GAE Search API
- Powerful expression-based search
- Supports TextField, DateField, HtmlField and NumberField
- Fuzzy-logic-based ranking; add your own ranking setup!
- But?

[Diagram: the GAE instance notifies the Google Search Index service when an object is created, updated or deleted]

Page 5: Full text search with App Engine Search Api and Django

What’s missing?
- Batch updates are recommended: store the pending updates for each model in the database.
- You need to create a service that updates each search document whenever its object is created, updated or deleted. (Picture it as a lot of repeated code.)
- Using incorrect field types makes searching harder.
- Let’s bring it to Django-Land.
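The batching idea above can be sketched in plain Python. This is only an illustration, not the package's code: the class name `PendingUpdates`, the in-memory storage (the slide suggests the database instead), the `put_batch`/`delete_batch` callables and the batch size of 200 are all assumptions made for the example.

```python
from collections import OrderedDict

# Assumed cap on documents per index call; check the Search API limits.
BATCH_SIZE = 200


class PendingUpdates:
    """Hypothetical queue that collapses repeated changes to the same object."""

    def __init__(self):
        # Maps (model_name, pk) -> "put" or "delete", in arrival order.
        self._pending = OrderedDict()

    def record(self, model_name, pk, action):
        # A later action for the same object replaces the earlier one.
        key = (model_name, pk)
        self._pending.pop(key, None)
        self._pending[key] = action

    def flush(self, put_batch, delete_batch):
        """Send queued keys in batches; return the number of calls made."""
        puts = [k for k, a in self._pending.items() if a == "put"]
        deletes = [k for k, a in self._pending.items() if a == "delete"]
        calls = 0
        for keys, send in ((puts, put_batch), (deletes, delete_batch)):
            for start in range(0, len(keys), BATCH_SIZE):
                send(keys[start:start + BATCH_SIZE])
                calls += 1
        self._pending.clear()
        return calls


# Usage: 450 saves plus one delete become four batched calls.
queue = PendingUpdates()
for pk in range(450):
    queue.record("Book", pk, "put")
queue.record("Book", 0, "delete")

sent = []
calls = queue.flush(
    lambda batch: sent.append(("put", len(batch))),
    lambda batch: sent.append(("delete", len(batch))),
)
```

In a real setup the two callables would wrap the index's put and delete calls, and a cron-triggered handler would call `flush`.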

Page 6: Full text search with App Engine Search Api and Django

- Install google-appengine-django-search
- Connect to a model:

    siteIndex.register(ModelName, field_list, rank_function)

- Set up a URL handler for the indexer:

    handlers:
    - url: /index
      script: searchApp.apps.app

- And set up a cron job:

    cron:
    - description: Index ranking
      url: /index
      schedule: every 5 minutes

Page 7: Full text search with App Engine Search Api and Django

siteIndex.register(ModelName, field_list, rank_function)

- ModelName -> any Model class
- field_list -> ['fieldname', 'fk.fieldname', 'fk.fk.fieldname', 'manyToManyField']
- rank_function -> score for an object

Default behaviour:

- Model creates/updates -> added to the search update queue
- Model deletes -> added to the search delete queue
- IntegerField is stored as search.NumberField, DateField is stored as search.DateField.
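To make the dotted `field_list` entries and the default type mapping concrete, here is a small sketch in plain Python. It is not the package's implementation: `resolve_path`, `document_field_name` and `search_field_type` are hypothetical helpers, the sample objects stand in for Django model instances, and the underscore naming is only a guess based on demo queries such as `author_publisher_name`.

```python
import datetime
from types import SimpleNamespace


def resolve_path(obj, path):
    """Follow a dotted path such as 'author.publisher.name' via attributes."""
    value = obj
    for part in path.split("."):
        if value is None:
            return None
        value = getattr(value, part, None)
    return value


def document_field_name(path):
    """Assumed naming scheme: 'author.publisher.name' -> 'author_publisher_name'."""
    return path.replace(".", "_")


def search_field_type(value):
    """Pick a Search API field type name for a Python value (illustrative)."""
    if isinstance(value, (int, float)):
        return "NumberField"
    if isinstance(value, datetime.date):
        return "DateField"
    return "TextField"


# Stand-ins for model instances linked by foreign keys.
publisher = SimpleNamespace(name="Atlas Press")
author = SimpleNamespace(name="Jean", publisher=publisher)
book = SimpleNamespace(pages=120, date=datetime.date(2004, 1, 1), author=author)

fields = {
    document_field_name(path): resolve_path(book, path)
    for path in ["pages", "date", "author.name", "author.publisher.name"]
}
```

Each value in `fields` could then be passed through `search_field_type` to decide which kind of search field to build for it.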

Page 8: Full text search with App Engine Search Api and Django

For the Power User
- Option to set the type as search.HtmlField: search only the text
- Using soft deletes? -> Publish your custom delete signal:

    siteIndex.register(ModelName, field_list, rank_function,
                       deleteSignal=customDeleteSignal)

- Foreign keys don’t update search documents by default (sorry, a limitation on my side) -> Publish your custom update signal:

    siteIndex.register(ModelName, field_list, rank_function,
                       updateSignal=customUpdateSignal)

- Don’t want to use the cron setup? -> Use index_create_single(obj) and index_delete_single(obj)

Page 9: Full text search with App Engine Search Api and Django

What’s in store for the future?
- Better ManyToManyField support?
- Attach listeners to child models’ auto save too.
- Make it work with the Google Managed VM environment too.
- Optimize index_create_single(obj) to use Task Queues.

Page 10: Full text search with App Engine Search Api and Django

Demo

tags: Horror NOT tags: Romance
https://django-test-143704.appspot.com/search?search_type=Book&query=tags%3A+Horror+NOT+tags%3A+Romance&start=0&end=5

date < 2005-04-01
https://django-test-143704.appspot.com/search?search_type=Book&query=date+%3C+2005-04-01&start=0&end=5

description: novel
https://django-test-143704.appspot.com/search?search_type=Book&query=description%3A+novel&start=0&end=5

author_name=Jean
https://django-test-143704.appspot.com/search?search_type=Book&query=author_name%3A+Jean&start=0&end=5

author_publisher_name=Atlas Press
https://django-test-143704.appspot.com/search?search_type=Book&query=author_publisher_name%3A+Atlas+Press&start=0&end=5
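The demo URLs are just the query string URL-encoded into the `query` parameter. A minimal sketch of building them with Python's standard library follows; the helper name `demo_search_url` is made up for this example, and the parameter names are taken from the URLs above.

```python
from urllib.parse import urlencode

BASE_URL = "https://django-test-143704.appspot.com/search"


def demo_search_url(query, search_type="Book", start=0, end=5):
    """Build a demo search URL; ':' becomes %3A and spaces become '+'."""
    params = [
        ("search_type", search_type),
        ("query", query),
        ("start", start),
        ("end", end),
    ]
    return BASE_URL + "?" + urlencode(params)


url = demo_search_url("tags: Horror NOT tags: Romance")
```

Calling it with `"date < 2005-04-01"` or `"description: novel"` reproduces the other demo links in the same way.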