Showing posts with label FAST Command. Show all posts

Tuesday, May 19, 2009

Handling Query Errors (Java Search API)

Query errors surface as exceptions thrown by the search() method of the ISearchView interface. Instead of printing the full Java exception, you can catch the specific exception and read its error code and error message.

Example:

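try {
    // search() throws SearchEngineException when the query fails
    IQueryResult result = engine.search(query);
    ...
    ...
} catch (SearchEngineException e) {
    // Print the error code first, then the error message
    System.err.println("Error " + e.getErrorCode() + ": " + e.getMessage());
}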

A nonzero error code indicates a query-related error.

Wednesday, November 19, 2008

RankLog in FAST ESP

The SBC (Search Business Center) provides a simple way of defining how document summaries are rendered, but it only allows the returned fields to be used (when the rank log is turned on).

To Enable RankLog:

1. Go to Search Profile Settings > Query Handling in SBC.
2. Add the static query parameter ranklog=true and save.
3. Publish the Search Profile: go to Publishing and click Publish Search Profile.

Sunday, October 12, 2008

ESP : Error 1005

This error appears in the log when the QPS (queries per second) rate exceeds the limit.

Error :

Error 1005 Query Term Refuse

Solution :

Check the QPS license limit in the Admin GUI. If the query rate exceeds the limit, obtain a new license and restart the QR Server.
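Before asking for a new license, it can help to know how high the query rate actually peaks. The following is only a rough sketch: it assumes you already have query arrival times as epoch seconds (how you extract them from your logs is up to you), and the function names peak_qps and exceeds_limit are made up for this example.

```python
from collections import Counter


def peak_qps(timestamps):
    """Return the highest number of queries that fell into any one-second bucket."""
    if not timestamps:
        return 0
    # Truncate each timestamp to a whole second and count queries per second.
    buckets = Counter(int(ts) for ts in timestamps)
    return max(buckets.values())


def exceeds_limit(timestamps, licensed_qps):
    """True if the observed peak rate is above the licensed QPS value."""
    return peak_qps(timestamps) > licensed_qps


# Hypothetical arrival times (seconds): three queries in second 0, one in 1, two in 2.
sample = [0.1, 0.5, 0.9, 1.2, 2.0, 2.1]
print(peak_qps(sample), exceeds_limit(sample, 2))
```

Fixed one-second buckets are a simplification; a sliding window would catch bursts that straddle two seconds.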

ESP : Error 28 No space left on device

This error is recorded in the log file below. It occurs when the configserver tries to save its configuration file.

$FASTSEARCH/var/log/configserver.scrap

This is a FATAL error. The main cause is that the partition has no space left to store the config file.

Error :

Error saving main configuration file: IOError: [Errno 28] No space left on device

Solution :

Clear some space to allow the configserver to save configuration.

Note :
Stopping the configserver during these conditions may cause information to be lost.
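A quick way to see whether a partition is running out of room is to ask the operating system for its free space. This is a minimal sketch, not part of FAST ESP: the function name has_free_space and the 100 MB threshold are assumptions chosen for illustration, and "." stands in for whichever directory holds your configuration.

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the partition holding `path` has at least min_free_bytes free."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free bytes
    return usage.free >= min_free_bytes


# Hypothetical threshold: want at least 100 MB free on the partition.
MIN_FREE = 100 * 1024 * 1024
print(has_free_space(".", MIN_FREE))
```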

ESP : FATAL Error 128

This error is recorded in the log file below. It occurs when there is a mistake in the config file.

$FASTSEARCH/var/log/configserver.scrap

This is a FATAL error. The main cause is that FAST ESP could not handle the character encoding while loading the config file.

Error :

Error loading config file: UnicodeError: ASCII encoding error: ordinal not in range (128)

Solution :

Edit the configuration file and remove the non-ASCII characters (those with an ordinal of 128 or higher).
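Finding the offending characters by eye can be tedious. Here is a small sketch that reports where they are; the function name find_non_ascii and the sample text are invented for this example, and it works on text you have already read into a string.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for every character with ordinal > 127."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # outside the ASCII range 0..127
                hits.append((line_no, col, ch))
    return hits


# Hypothetical config content with one accented character on line 2.
sample = "name = search\nowner = Jos\u00e9\n"
print(find_non_ascii(sample))
```

Line and column numbers start at 1, so they match what most editors display.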

ESP : Indexing

The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query. Without an index, the search engine would scan every document in the corpus, which would require considerable time and computing power. For example, while an index of 10,000 documents can be queried within milliseconds, a sequential scan of every word in 10,000 large documents could take hours. The additional computer storage required to store the index, as well as the considerable increase in the time required for an update to take place, are traded off for the time saved during information retrieval.
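To make the trade-off concrete, here is a toy sketch that answers the same query two ways: by scanning every document and by looking the word up in a prebuilt index. It is an illustration only (whitespace tokenizing, three tiny documents, made-up function names), not how FAST ESP builds its index.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_index(index, word):
    """Single dictionary lookup: cost does not grow with the number of documents."""
    return sorted(index.get(word.lower(), set()))


def search_scan(docs, word):
    """Sequential scan: every document is tokenized again for every query."""
    word = word.lower()
    return sorted(d for d, text in docs.items() if word in text.lower().split())


docs = {1: "fast search engine", 2: "search index design", 3: "index storage"}
index = build_inverted_index(docs)
print(search_index(index, "search"), search_scan(docs, "search"))
```

Both calls return the same ids; the difference is that the index pays the tokenizing cost once at build time, which is the storage and update cost the paragraph above mentions.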

Index Design Factors
Major factors in designing a search engine's architecture include:

Merge factors

How data enters the index, or how words or subject features are added to the index during text corpus traversal, and whether multiple indexers can work asynchronously. The indexer must first check whether it is updating old content or adding new content. Traversal typically correlates to the data collection policy. Search engine index merging is similar in concept to the SQL Merge command and other merge algorithms.

Storage techniques
How to store the index data, that is, whether information should be data compressed or filtered.

Index size
How much computer storage is required to support the index.

Lookup speed
How quickly a word can be found in the inverted index. The speed of finding an entry in a data structure, compared with how quickly it can be updated or removed, is a central focus of computer science.

Maintenance
How the index is maintained over time.

Fault tolerance
How important it is for the service to be reliable. Issues include dealing with index corruption, determining whether bad data can be treated in isolation, dealing with bad hardware, partitioning, and schemes such as hash-based or composite partitioning, as well as replication.

Index Data Structures
Search engine architectures vary in the way indexing is performed and in methods of index storage to meet the various design factors. Types of indices include:

Suffix tree

Figuratively structured like a tree, supports linear time lookup. Built by storing the suffixes of words. The suffix tree is a type of trie. Tries support extendable hashing, which is important for search engine indexing.[8] Used for searching for patterns in DNA sequences and clustering. A major drawback is that the storage of a word in the tree may require more storage than storing the word itself. An alternate representation is a suffix array, which is considered to require less virtual memory and supports data compression such as the BWT algorithm.

Trie
An ordered tree data structure that is used to store an associative array where the keys are strings. Regarded as faster than a hash table but less space-efficient.

Inverted index
Stores a list of occurrences of each atomic search criterion[10], typically in the form of a hash table or binary tree.

Citation index
Stores citations or hyperlinks between documents to support citation analysis, a subject of Bibliometrics.

Ngram index
Stores sequences of n items of data to support other types of retrieval or text mining.

Term document matrix
Used in latent semantic analysis, stores the occurrences of words in documents in a two-dimensional sparse matrix.
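As an illustration of one of the structures listed above, here is a small character n-gram index used for substring lookup over a list of terms. It is a sketch under simple assumptions (trigrams, an in-memory dictionary, made-up function names), not a description of any particular engine.

```python
from collections import defaultdict


def char_ngrams(term, n):
    """All contiguous character sequences of length n in term."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def build_ngram_index(terms, n=3):
    """Map each n-gram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in char_ngrams(term, n):
            index[gram].add(term)
    return index


def terms_containing(index, fragment, n=3):
    """Terms that contain fragment: intersect n-gram postings, then verify."""
    grams = char_ngrams(fragment, n)
    if not grams:  # fragment shorter than n: this index cannot answer it
        return []
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Sharing all n-grams is necessary but not sufficient, so confirm the substring.
    return sorted(t for t in candidates if fragment in t)


ngram_index = build_ngram_index(["indexing", "index", "inverted", "matrix"])
print(terms_containing(ngram_index, "ndex"))
```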

Monday, September 15, 2008

FAST : Integrate the File Traverser

A user-friendly interface to the File Traverser can be integrated into the administrator interface. The connector controller module enables the File Traverser to be integrated into the FAST ESP administrator interface.

To integrate the connector controller, complete the following procedure on each node that you want to make available for file traversing via the administrator interface.

Note :

If the logs suggest that the File Traverser does not start, check the
connectorcontroller scrap file $FASTSEARCH/var/log/connectorcontroller/connectorcontroller.scrap and the
filetraverser scrap file in $FASTSEARCH/var/log/FileTraverser_.scrap

1. Add the following entries to the $FASTSEARCH/etc/NodeConf.xml file.
a) Add the following to the element:



b) Add the following to the list of processes:



2. Execute command: $FASTSEARCH/bin/nctrl reloadcfg
3. Execute command: $FASTSEARCH/bin/nctrl start connectorcontroller

The File Traverser should now appear as a Data Source in the administrator interface.

4. Test a normal usage scenario
a) Add collections with the file traverser as a data source.
b) Start and stop the data source.
c) Delete the collection.

FAST : Disable User Authentication

We can disable user authentication in the FAST ESP administrator interface by completing the steps in this procedure.

1. Open $FASTHOME/etc/guiConfig.php
2. Set the following parameter:
$ADMINGUI_PHP_AUTH_DISABLED=True;

Thursday, September 4, 2008

FAST ESP : MARSHAL_MessageSizeExceedLimitOnClient

Question:
---------------

Error: MARSHAL_MessageSizeExceedLimitOnClient What can be the reason for it?

Answer:
--------------

Error: MARSHAL_MessageSizeExceedLimitOnClient usually occurs when you try to extract
records or attachments that exceed a specified size limit. Make sure that the
OMNIORB_CONFIG environment variable is set to point to the omniorb.cfg file. In this file, look for the property giopMaxMsgSize = 209715200 # 200 MBytes.

I believe the default is 200 MB.

The process will hang when this is misconfigured.
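To confirm what value is actually in effect, you can read the property out of the file. The sketch below assumes the simple "name = value # comment" layout shown in the answer above; the function name read_giop_max_msg_size is made up, and here the file content is passed in as a string rather than read from the path in OMNIORB_CONFIG.

```python
def read_giop_max_msg_size(cfg_text):
    """Return giopMaxMsgSize in bytes from omniorb.cfg-style text, or None if absent."""
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "giopMaxMsgSize":
            return int(value)
    return None


sample = "giopMaxMsgSize = 209715200 # 200 MBytes\n"
size = read_giop_max_msg_size(sample)
print(size // (1024 * 1024))  # size expressed in megabytes
```

Lines that are fully commented out are ignored, so a disabled setting is reported as absent.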

FAST ESP : Check the DocCount for Collection

Question

Is there a way to determine whether the index for a collection is completely
empty and deleted, i.e. after running adminclient -d AND deleting the collection in the GUI?
How can we know that everything is really gone?

Answer:

On large systems deleting all documents in a collection may take quite
some time. You should verify that all documents in the collection are gone
by issuing doccount-commands to all columns by using the rtsinfo tool.

Usage:

rtsinfo nameserver nameserverport clustername columnid rowid


For a system with three columns, one row and standard port range, run
these three commands on the admin node.

rtsinfo adminhost 16099 webcluster 0 0 doccount collectionname
rtsinfo adminhost 16099 webcluster 1 0 doccount collectionname
rtsinfo adminhost 16099 webcluster 2 0 doccount collectionname
(replace adminhost and collectionname with the entries valid for your system)
Typical output from each of these commands:

There are 1750 docs in the collection collectionname.
SUCCESS.

When "0 docs" is reported from all columns, the collection is clean.
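With more than a few columns, typing these commands by hand gets error-prone. The following sketch generates one command per column and parses the output format shown above. The helper names are invented for this example, and it only builds and parses strings; actually running the commands (for instance with subprocess.run) is left to you.

```python
import re

DOCCOUNT_RE = re.compile(r"There are (\d+) docs in the collection")


def doccount_commands(host, port, cluster, columns, row, collection):
    """Build one rtsinfo argument list per column, mirroring the commands above."""
    return [
        ["rtsinfo", host, str(port), cluster, str(col), str(row), "doccount", collection]
        for col in range(columns)
    ]


def parse_doccount(output):
    """Extract the document count from rtsinfo output, or None if it is missing."""
    match = DOCCOUNT_RE.search(output)
    return int(match.group(1)) if match else None


def collection_is_clean(outputs):
    """True only when every column reported exactly 0 docs."""
    return all(parse_doccount(out) == 0 for out in outputs)


cmds = doccount_commands("adminhost", 16099, "webcluster", 3, 0, "collectionname")
for cmd in cmds:
    print(" ".join(cmd))
```

A column whose output cannot be parsed counts as not clean, which is the safer default.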

FAST ESP 4.3.x : Delete Indexed Documents

QUESTION:

I have several collections that I would like to re-crawl from scratch, but I don't want to have to reconfigure all the settings for each. In FDS 3.x, is there a way to delete all crawled data without losing the collection configurations?


ANSWER:

Here are the steps required for deleting all crawled data and the index from a 3.2 installation without removing the crawler configuration:

IMPORTANT - This will cause complete loss of all indexed documents. Search will
therefore be unavailable for some time until the crawler has begun re-populating the collections. We strongly recommend initiating this procedure during a system maintenance window.

1. Stop FDS from the Admin GUI or using the command 'net stop FASTDSService'

2. Ensure all FAST processes have had time to stop completely and manually kill any remaining processes with the Task Manager

3. Delete all files and directories within the %FASTSEARCH%\data\ directory, EXCEPT %FASTSEARCH%\data\crawler\run\domainspec (this file contains the crawler collection configurations)

4. Start FDS with the command 'net start FASTDSService'

5. Once all FDS processes are active in the System Management page, open up the collection configuration for each collection, verify that the settings are still correct and then click 'submit' on each to refresh the collection information.


NOTES:


- You may see temporary OSErrors for the PostProcessor trying to locate the collections directory (which will be in the process of being rebuilt).

- You may also see temporary errors from the QRServer, such as 'All partitions down', because the index is still being rebuilt.

- Some collections may start immediately crawling, while others may be idle for a short time before they start crawling.
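Step 3 above is the risky one, since a slip deletes the crawler configuration too. Here is a rough sketch of "delete everything except one file", demonstrated on a throwaway temporary directory. The function name clear_data_dir is made up, the sketch does not handle symbolic links or locked files, and it should be tried on a copy before being pointed at a real data directory.

```python
import os
import tempfile


def clear_data_dir(data_dir, keep_rel_path):
    """Delete everything under data_dir except the file at keep_rel_path."""
    keep = os.path.normpath(os.path.join(data_dir, keep_rel_path))
    # Walk bottom-up so each directory is handled after its contents.
    for root, dirs, files in os.walk(data_dir, topdown=False):
        for name in files:
            path = os.path.normpath(os.path.join(root, name))
            if path != keep:
                os.remove(path)
        for name in dirs:
            path = os.path.join(root, name)
            if not os.listdir(path):  # remove only directories that are now empty
                os.rmdir(path)


# Demonstration on a temporary directory that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    keep_rel = os.path.join("crawler", "run", "domainspec")
    os.makedirs(os.path.join(tmp, "crawler", "run"))
    os.makedirs(os.path.join(tmp, "index"))
    for rel in (keep_rel, os.path.join("index", "segment")):
        open(os.path.join(tmp, rel), "w").close()
    clear_data_dir(tmp, keep_rel)
    remaining = sorted(os.listdir(tmp))
    kept = os.path.isfile(os.path.join(tmp, keep_rel))
print(remaining, kept)
```

Directories on the path to the kept file survive because they are never empty.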

FAST ESP : Term Descriptions

Question :

**********

Do you have a quick reference sheet for the terms associated with indexing and related concepts such as Search Clusters, Search Columns and Search Rows?

ANSWER
=======

This reference is found in the FAST Data Search 3.2 Configuration Guide.

A Data Search installation may consist of a number of Search Engines. A Search Engine provides indexing and search features towards a given partition of the total searchable content. The Search Engines are grouped in Search Clusters, Search Columns and Search Rows.

A Search Cluster is a group of Search Engines that share the same Index Profile (schema). This means that the collections assigned to this cluster may be mapped to the same index layout. One Search Cluster may for instance contain web pages and documents, while another Search Cluster may contain items from a content database.

The cluster may include multiple Search Rows (query rate scaling) and Search Columns (data volume scaling) that share the same index configuration. The matrix in the figure above indicates this.

Each Search Cluster will have a number of Collections assigned to it, which provides a logical grouping of content. Note that the collection concept represents a logical grouping of the content within the Search Cluster (one collection resides inside one Search Cluster, but may be spread across multiple Search Columns).

The Document Processing is performed prior to indexing. Within the document processing each document is represented by a set of Elements, which can be further processed and later mapped to searchable Fields via the Index Profile. Elements and Fields may represent content parts and attributes related to the document (body,
title, heading, URI, author, category).

The Index Profile defines the layout/schema of the searchable index, and defines how fields are to be treated by query and result processing. Each Search Cluster has an associated Index Profile.

The Index Profile also includes one or more Result Views that define alternative ways for a query front-end to view the index with respect to queries.

FAST ESP : Duplicate items when searching

Question :
**********

I'm getting a lot of identical hits for the same item. What have I
done wrong?

ANSWER:
*******

There are several possible causes for this, but the most common cause
is that the document ID is not present in the document summary.
Unless you've explicitly disabled incremental indexing, the first
entry in the first document summary class MUST be the document ID. If
not, incremental indexing will not work, and you will get lots of
duplicate items.

FAST ESP - Error code 1102: "Could not open channel to server."

Error code 1102: "Could not open channel to server." appears in the var/log/qrserver.scrap file.

Description:
--------------

1102 is the error code for "Could not open channel to server." It means that the topfdispatch process that the qrserver has been configured to use is not listening on the transport port.

In such cases the topfdispatch is most likely down, so all queries issued during that period will receive the 1102 error code. You may also see the transition error codes listed below.

Transition errors may appear when fdispatch goes down and the qrserver loses the
connection.

Typical transition error codes are:

1107: "Connection failed while waiting for query result."
1110: "Connection failed while waiting for document summaries."
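To get a feel for how often these errors occur, you can count them in the log. The sketch below is only an illustration: the log line format in the sample is invented (the real qrserver.scrap layout may differ), and the function name count_dispatch_errors is hypothetical. Only the three codes and messages come from this post.

```python
import re
from collections import Counter

# Error codes and messages as listed in this post.
QR_ERRORS = {
    1102: "Could not open channel to server.",
    1107: "Connection failed while waiting for query result.",
    1110: "Connection failed while waiting for document summaries.",
}


def count_dispatch_errors(log_lines):
    """Count occurrences of the known error codes in an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        # Look at every standalone four-digit number on the line.
        for code in re.findall(r"\b(\d{4})\b", line):
            if int(code) in QR_ERRORS:
                counts[int(code)] += 1
    return counts


# Hypothetical log lines; the real format is an assumption.
sample_lines = [
    "Error 1102: Could not open channel to server.",
    "query ok",
    "Error 1107: Connection failed while waiting for query result.",
    "Error 1102: Could not open channel to server.",
]
counts = count_dispatch_errors(sample_lines)
for code, n in sorted(counts.items()):
    print(code, n, QR_ERRORS[code])
```

Matching on any four-digit number is crude and could pick up unrelated values, so treat the counts as a hint rather than an exact figure.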


Solutions:
---------
Restart the topfdispatch processes, which can be done from the Admin GUI --> System
Management.

Such a down/up transition can be caused by a slow system (e.g. a ping timeout).

You could try to increase the "pingioctimeout" option by updating the file
$FASTSEARCH/etc/config_data/QRServer/webcluster/etc/qrserver/qrserverrc for
instance with:

pingioctimeout = 30000 # 30 seconds

and restarting the qrserver (nctrl stop/start qrserver) process on all nodes that are running qrserver.

To check if a server is running "qrserver" or any process, use the command
"nctrl sysstatus".

Thursday, August 28, 2008

Open a Command Prompt Window From Within Windows Explorer

Follow these steps to enable this option in the right-click drop-down menu in Windows Explorer:

1. Open "Windows Explorer"
2. Tools menu / Folder Options
3. Select File Types tab
4. Find and highlight "Folder" in File Types
5. Click Advanced
6. Click New to add new action
7. Type action name: Command Prompt
8. Type in "Application used to perform the action": cmd /k cd

(It may be necessary to type in the full path to cmd such as C:\WINNT\system32\cmd.exe)

Wednesday, July 30, 2008

FAST : nCtrl

nCtrl is an executable for controlling the FAST server node. It resides in FAST_HOME/bin. nCtrl reads its data from NodeConf.xml, which resides in the FAST_HOME/etc directory.

Commands :

nCtrl [Options] [Commands]

Status -> Display details of the processes/modules available on this node.
Ex : nCtrl status

Start -> Start the processes/modules.
Ex : nCtrl start [process1] [process2]...

Stop -> Stop the processes/modules.
Ex : nCtrl stop [process1] [process2]...

Kill -> Forcibly kill the processes/modules.
Ex : nCtrl kill [process1] [process2]...

Suspend -> Suspend the processes/modules.
Ex : nCtrl suspend [process1] [process2]...

Resume -> Resume processes/modules that were previously suspended.
Ex : nCtrl resume [process1] [process2]...

Create -> Create a new process/module.
Ex : nCtrl create [process1]

Reload Config -> Reload the node configuration file.
Ex : nCtrl reloadcfg
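If you script these commands, a small wrapper can catch typos before anything is sent to the node. This is a sketch only: the function name nctrl_args is made up, the accepted command names are just the ones listed in this post, and the executable name defaults to "nctrl" as an assumption (pass the full path from your installation if needed).

```python
# Commands taken from the list in this post.
NCTRL_ACTIONS = {"status", "start", "stop", "kill", "suspend", "resume", "create", "reloadcfg"}


def nctrl_args(action, *processes, executable="nctrl"):
    """Return the argument list for one nCtrl call, rejecting unknown commands."""
    if action not in NCTRL_ACTIONS:
        raise ValueError(f"unknown nCtrl command: {action}")
    return [executable, action, *processes]


print(nctrl_args("start", "qrserver"))
print(nctrl_args("reloadcfg"))
```

The returned list can be handed to subprocess.run(args, check=True) on a node where the executable is available.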