Thursday, May 22, 2014

Building a Search Index and Searching the Portal - VERITY Search Engine

What does the Verity search engine do?
How is a search index built?
What does the PORTAL_INDEX Application Engine program do?
How are content reference paths retrieved by searching the portal?
This post is an attempt to answer these questions.

Verity Search Engine:
PeopleSoft Portal technology provides its search feature through the Verity search engine. Together, the portal and Verity give users an easy, efficient way to search the content references registered in the portal registry.

Registry Search Index:
The portal registry collections are generated by the portal administrator and stored on the application server. This can be done either manually or by a scheduled process. The Application Engine program used to build the registry search index is PORTAL_INDEX.

This process can be run manually from the navigation:
PeopleTools -> Portal -> Build Registry Search Index

Verity collection files are stored on the application server in the location:
%PS_HOME%\data\search\<index_name>\<db_name>\<lang_cd>

  • %PS_HOME% - the home directory in which the PeopleSoft application server is installed
  • <index_name> - the name of the application, or the portal name for the portal registry that the collection serves (PORTAL)
  • <db_name> - the name of the database
  • <lang_cd> - the PeopleSoft language code. A single portal has only one collection per locale. Each language code directory holds the text files used to build the collection - input.bif and input.dat

Several subdirectories exist beneath the language code directory. These hold the various elements of the actual collection that Verity uses to conduct the search.
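
The directory layout above can be sketched in a few lines of Python. This is only an illustration of how the path pieces fit together, not a PeopleTools utility. The function names are hypothetical, and the sample values (C:\PS_HOME, HRDMO, ENG) are assumptions chosen for the example.

```python
from pathlib import PureWindowsPath
from typing import List


def collection_dir(ps_home: str, index_name: str, db_name: str, lang_cd: str) -> PureWindowsPath:
    """Compose %PS_HOME%\\data\\search\\<index_name>\\<db_name>\\<lang_cd>."""
    return PureWindowsPath(ps_home, "data", "search", index_name, db_name, lang_cd)


def input_files(coll_dir: PureWindowsPath) -> List[PureWindowsPath]:
    """The two text files the post says are used to build the collection."""
    return [coll_dir / "input.bif", coll_dir / "input.dat"]


# Hypothetical sample values: install directory, database name and language code.
portal_collection = collection_dir(r"C:\PS_HOME", "PORTAL", "HRDMO", "ENG")
print(portal_collection)
for f in input_files(portal_collection):
    print(f)
```

PureWindowsPath is used so the backslash-separated path is composed the same way regardless of the machine the sketch runs on.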

Running the PORTAL_INDEX Application Engine process builds a search collection. Only a selected set of fields from each content reference in the registry is included in the index.

Content References can be accessed from:
PeopleTools -> Portal -> Structure & Content -> <(nested) folder_name> -> ContentReference 
  
PeopleSoft Component Content References have the following information: 

  • ICType
  • Menu
  • Market
  • PanelGroupName

iScript Component Content References have the following information:

  • ICType

The following information is collected for all Content References:

  • Label
  • Description
  • Author
  • Product
  • Valid from Date
  • Valid to Date
  • Creation Date
  • Content Provider
  • URL
  • Path
  • Attributes

In the Content Reference Attributes, a set of keywords is entered for PeopleSoft delivered content references. These keywords are specified in Attribute Value, not in Name or Label. To add keywords to a content reference, set the Name of the content reference attribute to KEYWORD and enter the search words or phrases, separated by commas, in Attribute Value.
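
As a rough illustration of the comma-separated convention, the sketch below splits a KEYWORD attribute value into individual search terms. It is not PeopleTools code. The function names and the dictionary representation of the attributes are assumptions made for the example.

```python
from typing import Dict, List


def parse_keywords(attribute_value: str) -> List[str]:
    """Split a comma-separated Attribute Value into trimmed, non-empty terms."""
    return [term.strip() for term in attribute_value.split(",") if term.strip()]


def keywords_for(attributes: Dict[str, str]) -> List[str]:
    """Read keywords from the attribute whose Name is KEYWORD (hypothetical dict model)."""
    return parse_keywords(attributes.get("KEYWORD", ""))


# Hypothetical attribute set for one content reference.
attrs = {"KEYWORD": "payroll, pay check, W-2", "PORTAL_HIDE_FROM_NAV": "False"}
print(keywords_for(attrs))
```

Phrases such as "pay check" stay intact because only the commas act as separators.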
  
To make the Label and Attribute Value translatable, select the Translate check box. Two distinct tables are used for attributes: one is translatable into other languages and the other is not. For example, PORTAL_HIDE_FROM_NAV = False is used internally by the system and should not be translated, whereas keywords should be translated.
  
For accurate and high performance searches through the portal, the search engine must reference a comprehensive, up-to-date search index. The search index must be easy to maintain, as content is likely to change frequently within the portal registry.  
  
How is the Search Index built?

  1. The Application Engine (AE) program PORTAL_INDEX is launched through the Process Scheduler.
  2. This AE process launches a PeopleSoft C++ program. The C++ program queries the portal registry tables for search content and builds two text files. These files are created as .BIF and .DAT and are used by Verity to build its collection.
  3. The AE process then launches the Verity program MKVDK. The MKVDK program builds the search index (Verity collection) from the content in the .BIF and .DAT files.

Running the PORTAL_INDEX process:
In a busy portal where content changes frequently, it is important to refresh the search index often. Every time the search index is built, the existing search index is overwritten, so it is generally better to schedule this process to run in batch. Alternatively, the process can be run manually.
Go to: PeopleTools -> Portal -> Build Registry Search Index
Create a new Run Control ID or reuse an existing one, then run the process. Before running it, select the 'All Installed Languages' check box under Language Options if required.

Searching the Portal after building Search Index:
Type any valid keyword in Search and click 'Go'. The search performs the following steps:

  1. The entered text is converted to uppercase automatically. This lets the Verity search engine match the text irrespective of case.
  2. The query string is formatted and passed to the Search API. The formatting includes filters that exclude hidden content references, expired content references, and content references that are invalid based on their valid from and valid to dates.
  3. The Search API is called and returns the query results.
  4. The Portal Registry API is called to apply security filtering to the results. Security is applied in PeopleCode by checking the 'Authorized' property.
  5. The search results are formatted and displayed.
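
The five steps above can be modelled with a small in-memory Python sketch. It does not call Verity, the Search API, or the Portal Registry API. The ContentRef class, its fields, and the sample data are assumptions made for illustration, and the hidden and date filters are applied in Python here rather than being encoded in the query string as the real search does.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class ContentRef:
    # Hypothetical stand-in for an indexed content reference.
    label: str
    description: str
    path: str
    keywords: Tuple[str, ...]
    valid_from: date
    valid_to: date
    hidden: bool = False
    authorized: bool = True  # stands in for the 'Authorized' property


def search_portal(query: str, index: List[ContentRef], today: date) -> List[str]:
    # Step 1: uppercase the entered text so matching ignores case.
    needle = query.strip().upper()
    # Steps 2-3: match while excluding hidden and out-of-date entries.
    hits = [
        c for c in index
        if not c.hidden
        and c.valid_from <= today <= c.valid_to
        and any(needle in text.upper() for text in (c.label, c.description, *c.keywords))
    ]
    # Step 4: security filtering on the results.
    visible = [c for c in hits if c.authorized]
    # Step 5: format label, long description and path for display.
    return [f"{c.label} | {c.description} | {c.path}" for c in visible]


ALWAYS = (date(1900, 1, 1), date(2999, 12, 31))
SAMPLE_INDEX = [
    ContentRef("Build Registry Search Index", "Builds the portal search collection",
               "PeopleTools > Portal > Build Registry Search Index", ("search", "index"), *ALWAYS),
    ContentRef("Old Search Page", "Retired page", "Old > Search", (),
               date(2000, 1, 1), date(2010, 1, 1)),
    ContentRef("Search Admin", "Restricted page", "Admin > Search", (), *ALWAYS, authorized=False),
    ContentRef("Hidden Search", "Not shown in navigation", "Hidden > Search", (), *ALWAYS, hidden=True),
]

for line in search_portal("search", SAMPLE_INDEX, date(2014, 5, 22)):
    print(line)
```

With this sample data, only the first entry survives: the second is expired, the third fails the authorization check, and the fourth is hidden.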

For every content reference returned on the search results page, the following fields are displayed:

  1. Content reference label - a hyperlink that, when clicked, goes directly to the content reference URL
  2. Long Description of the content reference
  3. Path - breadcrumbs to the content reference
