Mathematical Document Retrieval System Based on Signature Hashing

Sourish Dhar
Sudipta Roy


Scientific documents and magazines contain a large number of mathematical expressions and formulae along with text. The continuous growth of such documents calls for specialized tools and techniques that can handle and analyse mathematical expressions and formulae. Mathematical expressions and formulae are highly structured and quite different from traditional text. As a result, conventional text retrieval systems perform poorly at retrieving scientific documents when the query is a mathematical expression. Mathematical information retrieval is concerned with finding information in documents that include mathematics. To address the challenges posed by mathematical formulae as compared to text, this paper aims to construct a math-aware search engine that can retrieve relevant scientific documents based on a mathematical query. A novel signature-based hashing scheme for indexing raw mathematical web documents is proposed, which also takes mathematical notational equivalences into account. The proposed system demonstrates better precision and stability of the ranked results when compared with other related state-of-the-art math-aware search engines.
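
The abstract does not spell out how the signatures are built, so the following is only a minimal illustrative sketch of the general idea of signature hashing with notational equivalence, not the authors' scheme. It assumes formulae are written in Python arithmetic syntax (a stand-in for the MathML or LaTeX found in web documents), that every variable is abstracted to a single placeholder, and that operands of commutative operators are sorted before hashing. The names `canonical`, `signature`, `build_index` and `search` are hypothetical.

```python
import ast
import hashlib
from collections import defaultdict

# Operators whose operands may be reordered without changing meaning.
COMMUTATIVE = (ast.Add, ast.Mult)


def canonical(node):
    """Return a canonical string for an expression tree (assumed normal form)."""
    if isinstance(node, ast.Name):
        return "V"  # assumption: all variable names collapse to one placeholder
    if isinstance(node, ast.Constant):
        return f"C{node.value}"
    if isinstance(node, ast.UnaryOp):
        return f"{type(node.op).__name__}({canonical(node.operand)})"
    if isinstance(node, ast.BinOp):
        parts = [canonical(node.left), canonical(node.right)]
        if isinstance(node.op, COMMUTATIVE):
            parts.sort()  # x + y and y + x get the same form
        return f"{type(node.op).__name__}({','.join(parts)})"
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def signature(expr):
    """Hash the canonical form of a formula into a fixed-length signature."""
    tree = ast.parse(expr, mode="eval")
    text = canonical(tree.body)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def build_index(docs):
    """Map each signature to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, formulas in docs.items():
        for formula in formulas:
            index[signature(formula)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents with a formula matching the query signature."""
    return sorted(index.get(signature(query), set()))


docs = {
    "d1": ["x + y*2"],
    "d2": ["a**2 + b**2"],
    "d3": ["x - y"],
}
index = build_index(docs)
print(search(index, "2*b + a"))      # same structure as "x + y*2"
print(search(index, "q**2 + p**2"))  # same structure as "a**2 + b**2"
```

Collapsing all variables to one placeholder keeps the sketch short but is lossy: `x + x` and `x + y` receive the same signature, and exact-match lookup gives no ranking. A real math-aware engine would need a finer normal form and a scoring step.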



