New index
This document shows how to use a search definition to create and update an index. The current version of the definition lets you have one title that is stored, one URL that is unique and can be generated from record information and LiveText, and content that is indexed but not stored. The tables and fields to index are specified in the definition given to the indexer.
The definition
The definition passed to SearchBase as searchdef is a dict with at least one key named 'searchtables', whose value should be a tuple or list. To this list you add lists or tuples of strings, one for each table (or part of a table) you want to index. The example further down fills in every field for the first table added. You can index as many tables, or parts of tables, as you like in one index, as long as you are able to create unique keys for the url field.
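The shape described above can be checked before the definition is handed to the indexer. The following is a minimal sketch only: check_searchdef is a hypothetical helper written for this page, not part of debsearch3, and it tests nothing beyond the structure stated above (a dict, a 'searchtables' list or tuple, inner lists or tuples of strings).

```python
def check_searchdef(searchdef):
    """Return a list of problems found in a search definition (empty if none).

    Hypothetical helper: it only checks the structure described in the text,
    not whether the named tables or fields exist.
    """
    tables = searchdef.get('searchtables') if isinstance(searchdef, dict) else None
    if not isinstance(tables, (list, tuple)):
        return ["'searchtables' must be a list or tuple"]

    problems = []
    for i, entry in enumerate(tables):
        if not isinstance(entry, (list, tuple)):
            problems.append("entry %d is not a list or tuple" % i)
        elif not all(isinstance(item, str) for item in entry):
            problems.append("entry %d contains a non-string item" % i)
    return problems


good = {'searchtables': [['Document', 'Title', 'Title;Body']]}
bad = {'searchtables': [['Document', 5], 'Document']}

print(check_searchdef(good))
print(check_searchdef(bad))
```

Returning a list of problems rather than raising on the first one makes it easier to fix a definition with several tables in one pass.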
Each inner list or tuple holds the following items, in this order:
0  TableName   The name of the table that holds the fields to index.
1  Title       The field with the information to store and return when you get a hit in the index.
2  Content     The fields to use for content in the index. To add more than one field, separate them with ;. In the index each field is read and added to the content string, separated by a space.
3  Url         Used as the unique key in the index, and should also be how you retrieve your document when a search returns a hit. The value must be unique and must not change between updates, as a changed value produces duplicate entries and missing documents in search results. You can use LiveText to add data from the current record, so if your table has unique keys you can use those.
               Ex. '/viewdoc.html/?&cp=[{ChapterPrefix}]&ps=[{PageSlug}]'
4  UpdateDate  The date the record was last updated, used by the refreshindex routine.
5  UpdateTime  The time the record was last updated, used by the refreshindex routine.
6  CreateDate  The create date, used by the refreshindex routine to see if the document is new.
7  CreateTime  The create time, used by the refreshindex routine to see if the document is new.
8  Query       The query used to find documents in the table.
A dict might be added as an option at a later point in time, but the list of lists works now.
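To make the Content and Url items more concrete, here is a rough sketch of what they describe. It is not the LiveText implementation or the indexer's code: expand_url and join_content are hypothetical helpers, the record is an invented plain dict, and the only thing assumed about LiveText is that a [{FieldName}] placeholder is replaced by that field's value from the current record.

```python
import re

# Matches placeholders of the form [{FieldName}], as in the Url example.
PLACEHOLDER = re.compile(r"\[\{(\w+)\}\]")


def expand_url(template, record):
    """Replace each [{Field}] placeholder with the record's value (assumed behaviour)."""
    return PLACEHOLDER.sub(lambda m: str(record[m.group(1)]), template)


def join_content(fieldspec, record):
    """Read each ;-separated field and join the values with a space."""
    names = [name.strip() for name in fieldspec.split(';') if name.strip()]
    return " ".join(str(record[name]) for name in names)


# Invented sample record, for illustration only.
record = {
    "ChapterPrefix": "ch01",
    "PageSlug": "intro",
    "Title": "Introduction",
    "Body": "Welcome.",
}

url = expand_url('/viewdoc.html/?&cp=[{ChapterPrefix}]&ps=[{PageSlug}]', record)
content = join_content('Title;Body', record)
print(url)      # /viewdoc.html/?&cp=ch01&ps=intro
print(content)  # Introduction Welcome.
```

Because the url doubles as the unique key, the fields used in the placeholders should together identify exactly one record and should not be fields that get edited.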
Example of indexing a Document table
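    import os

    from debdata3 import getrepo, uninitprism9
    from debsearch3.searchbase import SearchBase

    # updatedone is never set to True in this snippet, so the commit() call
    # further down is skipped as written.
    updatedone = False

    searchroot = "C:\\path\\to\\your\\search\\index\\root"
    searchname = "searchname"
    repopath = "C:\\path\\to\\your\\database\\to\\search"
    reponame = "YourAppName"

    # One inner list per table, items in the order given under "The definition".
    searchdef = {
        'searchtables': [
            [
                'Document',                    # 0 TableName
                'Title',                       # 1 Title
                'Title;Body',                  # 2 Content
                '/viewdoc.html/?&cp=[{ChapterPrefix}]&ps=[{PageSlug}]',  # 3 Url
                'UpdatedDate',                 # 4 UpdateDate
                'UpdatedTime',                 # 5 UpdateTime
                'CreatedDate',                 # 6 CreateDate
                'CreatedTime',                 # 7 CreateTime
                'BookPrefix="yourbookprefix"', # 8 Query
            ],
        ],
    }

    try:
        # Open the repo first; the indexer queries it.
        repo = getrepo(repopath, reponame)
        search = SearchBase(searchname, searchroot, searchdef)
        if os.path.exists(search.indexpath):
            # Existing index: only add new records and update changed ones.
            search.refreshindex()
        else:
            # No index yet: create it and index everything the query finds.
            search.createindex()
            search.buildindex()
        print("Records added: %d updated: %d" % (search.addcount, search.updatecount))
        if updatedone:
            search.commit()
        uninitprism9()
        print("Done")
    except Exception as e:
        print(e)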
This example reads all documents found by the query and adds them to the index. The create date and time, combined with the update date and time, keep the index up to date.
The example works by first opening the repo that we are going to query for documents, as the indexer will then use it. Then we create an instance of SearchBase with the index name, the root path where we store our indexes, and the definition used by the indexer. Then we check whether the index exists; if it does not, we create it without any further tests. If it exists, we call refreshindex, which uses the date fields to figure out what is new or needs updating.
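How the date fields can separate new records from updated ones may be easier to see in a small sketch. This is not the refreshindex implementation, whose internals are not described here. It assumes the index remembers when it was last refreshed, that a record created after that moment counts as new, and that one updated after it counts as updated. The classify function, the last_indexed value and the use of Python date and time objects are all assumptions made for illustration.

```python
from datetime import date, datetime, time


def classify(record, last_indexed):
    """Return 'new', 'updated' or 'unchanged' for a record (assumed logic)."""
    created = datetime.combine(record["CreatedDate"], record["CreatedTime"])
    updated = datetime.combine(record["UpdatedDate"], record["UpdatedTime"])
    if created > last_indexed:
        return "new"        # did not exist at the last refresh
    if updated > last_indexed:
        return "updated"    # existed, but changed since the last refresh
    return "unchanged"


# Hypothetical moment of the previous refresh.
last_indexed = datetime(2024, 1, 10, 12, 0)

old_record = {
    "CreatedDate": date(2024, 1, 1), "CreatedTime": time(9, 0),
    "UpdatedDate": date(2024, 1, 2), "UpdatedTime": time(9, 0),
}
edited_record = dict(old_record, UpdatedDate=date(2024, 1, 11))
fresh_record = {
    "CreatedDate": date(2024, 1, 12), "CreatedTime": time(8, 30),
    "UpdatedDate": date(2024, 1, 12), "UpdatedTime": time(8, 30),
}

for rec in (old_record, edited_record, fresh_record):
    print(classify(rec, last_indexed))
```

Under these assumptions, a table that does not maintain its update date and time reliably would leave changed records looking unchanged, which is why all four date and time fields are part of the definition.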