Abstract
A Swiss software editor is looking for a tool able to categorize documents, extract topics and offer additional semantic features. The technology sought is ideally already at or near market stage, but technologies still in development that match the request may also be considered. The preferred commercial agreement is an OEM partnership including technical assistance and service.
Details
A software editor in Records and Document Management is looking for a component (or a set of components) that provides the following features:
• Categorize a document into a set of dynamic categories (taxonomy of 500-2000 elements)
• Extract “topics” from a document
• Identify from a list the set of documents that are “semantically close” (in terms of proximity) to a given input document
• Identify the semantic clusters within a list of documents
• Identify (and optionally extract) semantic meta-data from documents
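To make the "semantically close" requirement concrete, the following minimal sketch ranks a list of documents by their proximity to an input document. It assumes a plain bag-of-words cosine similarity, which is only one of many possible techniques and is not prescribed by the request; the class and method names (ProximitySketch, termCounts, cosine, rankByProximity) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: ranks candidate documents by bag-of-words cosine similarity. */
class ProximitySketch {

    /** Lowercases the text, splits on anything that is not a letter or digit, and counts each term. */
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Euclidean length of a term-count vector. */
    static double norm(Map<String, Integer> vector) {
        double sum = 0.0;
        for (int count : vector.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity in [0, 1]; returns 0 when either document has no terms. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += (double) entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    /** Returns the indices of the candidates, ordered from most to least similar to the input. */
    static List<Integer> rankByProximity(String input, List<String> candidates) {
        Map<String, Integer> query = termCounts(input);
        List<Double> scores = new ArrayList<>();
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < candidates.size(); i++) {
            scores.add(cosine(query, termCounts(candidates.get(i))));
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> scores.get(i)).reversed());
        return order;
    }
}
```

A production component would likely go beyond raw term counts (for example with term weighting or language-specific processing), but the same "score every candidate, then sort" shape would apply to the proximity feature.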
The following specifications are required:
The component(s) must provide clearly defined APIs (Web Services or Java preferred) to guarantee a simple and powerful level of integration.
PACKAGING AND DEPLOYMENT
Different options can be considered as long as the tool is self-contained and provided as one product. Whether it is then packaged as a set of libraries, an appliance or in another form is acceptable in principle.
The objective is purely functional, so it does not matter whether the tool relies on artificial intelligence, language processing, ontologies or another approach. It is nonetheless important that the Swiss company fully understands the implications of implementing a given technology internally or at a client site.
The tool must provide a very high level of performance in both indexing and search, as it may need to process hundreds of thousands of documents per day. Performance figures will be required as part of any evaluation.
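As a rough illustration of what such a daily volume implies, the sketch below converts documents per day into a sustained rate and a per-document time budget. The figure of 500,000 documents per day and the worker count are assumptions chosen only to illustrate "hundreds of thousands"; the helper names are hypothetical, and the calculation assumes load spread evenly over 24 hours, which real workloads rarely are.

```java
/** Illustrative only: back-of-the-envelope throughput arithmetic. */
class ThroughputSketch {

    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400 seconds

    /** Average documents per second needed to sustain a daily volume spread evenly over the day. */
    static double docsPerSecond(long docsPerDay) {
        return docsPerDay / (double) SECONDS_PER_DAY;
    }

    /** Time budget per document in milliseconds when `workers` documents are processed in parallel. */
    static double millisPerDocument(long docsPerDay, int workers) {
        return 1000.0 * workers / docsPerSecond(docsPerDay);
    }

    public static void main(String[] args) {
        long assumedDailyVolume = 500_000L; // assumption, not a figure from the request
        System.out.printf("rate: %.2f docs/s%n", docsPerSecond(assumedDailyVolume));
        System.out.printf("budget with 4 workers: %.0f ms/doc%n",
                millisPerDocument(assumedDailyVolume, 4));
    }
}
```

Under these assumptions, 500,000 documents per day corresponds to roughly 5.8 documents per second on average. Peak-hour load would be higher, so figures supplied for an evaluation would need to state the conditions under which they were measured.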
Support for additional languages and non-Latin scripts such as Cyrillic, Arabic, Hebrew, Chinese, etc. is a plus, but may also be addressed in later development.
Ideally, this component would already be in production and installed at a few customer sites. Pre-production components will also be considered if there is a strong match with the functional requirements.
The agreement would be an OEM partnership that includes, in addition to the product itself (royalties on sales), level 2 and 3 maintenance and, optionally, a service/training partnership.
Technical Specifications / Specific technical requirements:
- Clearly defined APIs (Web Services or Java preferred)
- Self-contained package (library or appliance)
- High performance in terms of indexing and search
- Adaptability in terms of languages and alphabets