Sophisticated relevance ranking algorithms bring the most relevant items to the top of a search results list.
“Forced Rankings” System
The first items in a relevance-ranked list can be predetermined based on the presence or absence of certain keywords in the query.
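As a rough illustration of the idea (not the product's actual implementation), forced rankings can be thought of as a set of keyword rules applied on top of the relevance-ranked list. The class name, method name, and rule format below are hypothetical, and the sketch handles only the "keyword is present" case.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names and rule format are assumptions. */
final class ForcedRanking {

    /**
     * Places items pinned to any keyword in the query ahead of the
     * relevance-ranked items, removing duplicates while keeping order.
     */
    static List<String> apply(String query,
                              List<String> rankedIds,
                              Map<String, List<String>> pinnedByKeyword) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String word : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            ordered.addAll(pinnedByKeyword.getOrDefault(word, List.of()));
        }
        ordered.addAll(rankedIds); // remaining items keep their relevance order
        return new ArrayList<>(ordered);
    }
}
```

In this sketch a pinned item appears first even if the relevance ranking placed it lower; a query with no matching rule returns the ranked list unchanged.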
Full-text search “hits” are highlighted.
Smart document summaries show query terms highlighted in context. Summary length and layout are customizable.
Full Query Language
A query language that supports both full-text and parametric (structured) searches in the same query, using simple and powerful syntax. Full support for boolean AND, OR, and NOT queries, fielded searches, comparisons (=, >, >=, <, <=, <>), parentheses, exact phrases, wildcards (*), and regular expressions.
Users receive suggestions for possibly misspelled words in a search query. For example, a user typing the word “recieve” will see the message “Did you mean receive?” The system suggests only words that are actually in the index.
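One common way to produce such suggestions is to compare the query word against the terms in the index using edit distance. The sketch below uses Levenshtein distance as a stand-in technique; the source does not say which algorithm the product uses, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative sketch only; not the product's actual suggestion code. */
final class SpellingSuggester {

    /** Levenshtein edit distance computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the closest indexed term within maxDistance, or empty if the
     * word is already indexed or no term is close enough.
     */
    static Optional<String> suggest(String word, List<String> indexedTerms, int maxDistance) {
        if (indexedTerms.contains(word)) {
            return Optional.empty();
        }
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String term : indexedTerms) {
            int distance = editDistance(word, term);
            if (distance < bestDistance) {
                best = term;
                bestDistance = distance;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Because candidates are drawn only from the list of indexed terms, a suggestion can never point to a word that returns no results.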
The system will optionally include items that contain inflectional forms of search terms. For example, a search for “drill” will also pick up items that include “drills”, “drilled”, “drilling”, etc.
Thesaurus for Synonyms
An optional thesaurus will find synonyms for search terms. For example, a search for “red” can also find “crimson”, “scarlet”, etc.
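Conceptually, synonym support amounts to expanding the query before it is run. The following is a minimal sketch under that assumption; the thesaurus is modeled as a plain map, and none of these names come from the product's API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; the thesaurus format is an assumption. */
final class ThesaurusExpander {

    /** Returns each query term followed by its synonyms, without duplicates. */
    static Set<String> expand(List<String> queryTerms, Map<String, List<String>> thesaurus) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String term : queryTerms) {
            expanded.add(term);
            expanded.addAll(thesaurus.getOrDefault(term, List.of()));
        }
        return expanded;
    }
}
```

The expanded terms would then be combined with OR, so a search for “red” also matches items containing “crimson” or “scarlet”.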
Relevance Weighting by Field
Words that appear in one field can receive a greater weight in the relevance ranking than if they appear in another field. For example, products that have the word “drill” in the name can appear higher in the search results than products that have the word “drill” buried in a product description.
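The effect of field weighting can be sketched as a score in which each match is multiplied by the weight of the field it occurs in. This is a deliberately simplified model (a real relevance formula considers much more), and the field names and weights below are made up for illustration.

```java
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; a simplified stand-in for a real ranking formula. */
final class FieldWeightedScorer {

    /**
     * Sums the weight of every field occurrence of the term.
     * Fields without a configured weight count as 1.0.
     */
    static double score(String term, Map<String, String> item, Map<String, Double> fieldWeights) {
        String needle = term.toLowerCase(Locale.ROOT);
        double total = 0.0;
        for (Map.Entry<String, String> field : item.entrySet()) {
            double weight = fieldWeights.getOrDefault(field.getKey(), 1.0);
            for (String word : field.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (word.equals(needle)) {
                    total += weight;
                }
            }
        }
        return total;
    }
}
```

With a weight of 3.0 on "name" and 1.0 on "description", a product named “Cordless Drill” outscores one that only mentions a drill in its description.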
Support for dozens of languages, including Asian languages.
Common words like “and”, “of”, “the”, etc. (sometimes called “noise” words) can optionally be ignored, reducing the size of the index and increasing search speed.
Accents / Diacriticals
Words with accents can be found, even if the user does not enter the accents correctly. A search for “pate” will find “pâté” and vice-versa.
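A common way to get this behavior in Java is to fold accents out of both the indexed text and the query, so that the two compare equal. The sketch below uses the standard `java.text.Normalizer` class; whether the product works this way internally is an assumption.

```java
import java.text.Normalizer;

/** Illustrative sketch only; shows accent folding with the standard library. */
final class AccentFolder {

    /**
     * Decomposes accented characters (NFD) and removes the combining marks,
     * so "pâté" and "pate" fold to the same string.
     */
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

Applying `fold` at both index time and query time makes the match symmetric: a query with accents finds unaccented text, and the reverse.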
Dynamic Indexes, Real-Time Searchability
Items can be added to the index and found immediately. There is no need for mass periodic indexing.
Search and Navigation
Users can navigate data sets using intuitive, dynamically generated menus. Menus are generated from the underlying document attributes or metadata. Menus give users context-dependent browse capability, allowing them to see what options are available to them at each step.
Unique Value Lists
It is possible to generate the list of all unique values in a field/attribute for a given search result, along with the number of items that have that value. For example, after a search for “Bob” in a list of names and addresses, the system can show a list of associated cities ["Smithberg (10), Jonesville (8), Leeberg (5)"]. Users can narrow or modify a search using these values.
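The unique value list described above is essentially a count of field values over the current result set. A minimal sketch, assuming each result is a simple map of field names to values (a hypothetical representation, not the product's data model):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; the result representation is an assumption. */
final class UniqueValueCounter {

    /**
     * Counts how many results carry each value of the given field and
     * returns the values ordered by descending count, then alphabetically.
     * Results that lack the field are skipped.
     */
    static Map<String, Integer> countValues(List<Map<String, String>> results, String field) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> item : results) {
            String value = item.get(field);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        Map<String, Integer> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            ordered.put(entry.getKey(), entry.getValue());
        }
        return ordered;
    }
}
```

Each value in the returned map can be rendered as a link that adds an exact-match filter on that value to the current query.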
A new taxonomy data type makes it a snap to build dynamic, hierarchical menus for users to browse data sets.
User Interface Classes New!
A new set of classes makes it possible to build faceted navigation interfaces with a minimum of coding. New classes support lists, trees, dropdowns, and listboxes. New support for dynamic ranges for high-cardinality data.
Query Display and Modification New!
A new user interface class displays queries and allows users to refine or expand them intuitively.
Special Query Features
Search Within Fields
Searches can be confined to an individual field/attribute. For example, a user can search for “Joe” within “author”.
Exact Match Filtering
Fields can be matched exactly. For example, city="Anytown", author="John Smith".
Items can be viewed, compared, sorted, and filtered in a convenient table format, even if they have different sets of attributes. Missing attribute values are handled gracefully. This is especially useful for product catalog applications. See the product catalog demo for an example.
Fields (attributes) can be marked as hidden or visible by default. Users have the option of viewing only the fields they wish.
Items can be sorted in a result set by the value of any field or attribute. Most search engines sort items only by relevance.
Fields can have multiple values. For example, a drill might have a “use” attribute with the values “masonry”, “wood”, and “concrete”. A search for a drill with use="concrete" would select the appropriate drills.
Full support for indexing XML in any format. No need to predefine the structure — tags are added and made searchable on the fly.
XML hierarchical tag structure is preserved. The query language supports searching both by tag and by XML path.
Can break XML files into separate “documents” based on any tag.
Summaries that show query terms highlighted in context can be shown, even across different sections of an XML document.
Can retrieve all or part of an XML document.
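To make the idea of breaking an XML file into separate “documents” by tag concrete, here is a small sketch using the standard JAXP DOM parser that ships with Java. It is only an illustration of the concept; the product's own XML handling is not documented here, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative sketch only; uses the JDK's built-in DOM parser. */
final class XmlSplitter {

    /**
     * Treats every element with the given tag name as a separate "document"
     * and returns the text content of each one, in document order.
     */
    static List<String> splitByTag(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> documents = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                documents.add(nodes.item(i).getTextContent().trim());
            }
            return documents;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Splitting a catalog file on a tag such as `product` yields one indexable unit per product rather than one per file.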
ECCMA & Dublin Core
Support for special XML formats, including ECCMA and Dublin Core.
Application Development and Administration
Create, edit, and delete indexes using a menu-driven administrative application.
Define Data Connections Visually
Connect to data sources, including websites, directories of files, and databases by selecting menu options.
Control Indexing Process
Start, stop, and monitor indexing visually.
Perform ad-hoc searching of indexes with the new search interface. No need to create a custom web page. Eases development by making testing convenient.
Add and modify attributes with point and click. Set searchability and other characteristics of attributes.
View all unique values for an attribute along with the number of hits. Makes development of navigational interfaces easy. Makes it faster to find errors in the data.
Edit the thesaurus to manage your own list of synonyms.
Index Properties Editor
Manage index options with point and click. No need to edit configuration files.
Comes bundled with a built-in high-performance web server and servlet container. Provides a complete environment for deploying production applications. There is no longer a need to purchase an expensive server, or struggle with implementing an inexpensive alternative. Simple installation.
Start/stop the server, manage license keys, and view server statistics.
Crawls websites and directories of files. Multi-threaded for high-speed indexing. Numerous options for controlling the crawling process.
“Autoindexer” module supports scheduled crawls and scans of sites, directories, and databases.
Include or exclude web pages or files using URL wildcards.
Provide a username and password for crawling restricted sites.
Optionally strip HTML formatting from text in SQL databases and flat files.
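The URL wildcard include/exclude rules mentioned above can be pictured as patterns in which `*` matches any run of characters. The sketch below converts such wildcards to regular expressions; the rule semantics (exclusions win, an empty include list admits everything) are assumptions chosen for illustration, not documented product behavior.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch only; the rule semantics are assumptions. */
final class UrlFilter {

    /** Converts a wildcard such as "*.pdf" into a regex; only '*' is special. */
    static Pattern wildcardToRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matchesAny(String url, List<String> wildcards) {
        for (String wildcard : wildcards) {
            if (wildcardToRegex(wildcard).matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    /** A URL is crawled if it matches an include rule and no exclude rule. */
    static boolean shouldCrawl(String url, List<String> includes, List<String> excludes) {
        boolean included = includes.isEmpty() || matchesAny(url, includes);
        return included && !matchesAny(url, excludes);
    }
}
```

Quoting the literal parts of the wildcard keeps characters such as `.` and `?` in URLs from being interpreted as regex operators.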
Item Preprocessor New!
The system includes an interface for preprocessing items before they’re added to the index. Items can be modified, assigned to categories, rejected, or changed in arbitrary ways as part of the normal indexing process.
Document and Database Support
HTML, PDF, XML, and Microsoft Office Formats
Indexes documents in a variety of formats. Automatically extracts document metadata for more sophisticated searching.
Indexes user-defined data in any form via API calls.
Flat Database Files
Imports and indexes flat files in comma-delimited, tab-delimited, and other formats.
SQL Databases, Including Oracle, MS SQL, MySQL, DB2, Others
Indexes tables in any SQL database that provides a JDBC driver. All major and most minor DBMS vendors provide JDBC drivers.
No Predefined Schema Required
Any document, XML file, or database record can be indexed without first defining its structure. For example, if a new product is added to the index, and that product has some new attributes (say, length, color, etc.), these attributes will be automatically added to the index and will become searchable, sortable, and filterable. This makes it possible to index items of completely different structures without administrative overhead.
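One way to picture schema-free indexing is an inverted index keyed first by field name, where a field comes into existence the first time an item uses it. The following toy sketch illustrates that idea only; the class and its methods are hypothetical and say nothing about the product's actual index structure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; a toy index with no predefined fields. */
final class SchemalessIndex {

    // field name -> field value -> ids of the items holding that value
    private final Map<String, Map<String, Set<String>>> postings = new HashMap<>();

    /** Indexes an item; attributes never seen before become new fields. */
    void add(String itemId, Map<String, String> attributes) {
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            postings.computeIfAbsent(attribute.getKey(), field -> new HashMap<>())
                    .computeIfAbsent(attribute.getValue(), value -> new TreeSet<>())
                    .add(itemId);
        }
    }

    /** Returns the ids of items whose field has exactly this value. */
    Set<String> find(String field, String value) {
        return postings.getOrDefault(field, Map.of()).getOrDefault(value, Set.of());
    }

    /** Returns every field name seen so far, in alphabetical order. */
    Set<String> fields() {
        return new TreeSet<>(postings.keySet());
    }
}
```

Adding a product with new “length” and “color” attributes makes both fields searchable at once, without any schema change or reindexing of earlier items.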
Smart Titles for PDF New!
The system uses advanced heuristics to create good titles for PDF documents when titles don’t already exist.
Support for 40 languages and 140 dialects, including Asian languages.
Advanced word-breaking technology accurately identifies indexable text, even for languages that do not put spaces between words.
Linguistic stemmers provided for Danish, Dutch, English, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish.
Stop word lists provided for a dozen languages.
All text is stored as compressed Unicode, providing support for any character set.
Will accurately convert native character sets into Unicode for most document types.
Detects character set encoding used in HTML documents.
Written Entirely in Java
Not just a Java API — the entire product was written from the ground up in Java. Fully portable to any operating system that supports Java.
Memory and disk space usage is configurable. Unlike many search engines, does not require a dedicated server for OEM applications.
The core search engine is easily embedded in other Java applications. It has a wide-open API and no external dependencies.
Output as JDBC ResultSets or XML
Search results returned via the familiar JDBC API or as XML.
Items added to the index are immediately searchable. No need to wait for lengthy batch updates or merges.
A logging module tracks all searches, providing a means of analyzing user needs.
Any errors that occur are logged for diagnostic purposes.
Dieselpoint Search ships with a production-quality, pre-built, search-enabled product catalog application. Not just a demo, this application can power your business. Source code provided.