Publication | Closed Access
A signature file scheme based on multiple organizations for indexing very large text databases
Citations: 36
References: 0
Year: 1990
Engineering, Business Intelligence, Text Mining, Information Retrieval, Data Science, Data Mining, Database System, Database Support, Block Descriptor File, Management, Data Integration, Large Databases, Data Retrieval, Data Management, Signature File Scheme, Library Database, Knowledge Discovery, Computer Science, Database Technology, Data Security, Cryptography, Data Indexing, Large Text Databases, Multiple Organizations, Indexing Technique, Big Data
A new signature file method is presented for accessing information in large databases that contain both formatted and free-text data. The method, called the multiorganizational scheme, is proposed for indexing very large databases containing hundreds of thousands or possibly millions of records. With this method, records are grouped into blocks and a signature is formed for each block of records. These signatures are stored in a block descriptor file using a storage structure called the bit-slice organization. By forming multiple block descriptor files, each based on a possibly different grouping of records into blocks, it is possible to determine efficiently which records match a query. Both computational results based on a mathematical model and experimental results using a library database are presented. These results show that the method provides effective access to large text databases. © 1990 John Wiley & Sons, Inc.
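The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the signature width (`SIG_BITS`), the number of bits set per term (`BITS_PER_TERM`), the SHA-256-based hashing, the two example groupings, and all function names are assumptions chosen for illustration only. Each block descriptor file is held in bit-slice form, as one integer per signature bit position whose bit `b` is set when block `b` has that position set. A record remains a candidate only if its block matches in every descriptor file, and candidates are then checked against the record text to discard false drops.

```python
import hashlib

SIG_BITS = 64        # signature width in bits (assumed, not from the paper)
BITS_PER_TERM = 3    # bit positions set per term (assumed)

def term_bits(term):
    """Bit positions a term sets in a signature (superimposed coding)."""
    digest = hashlib.sha256(term.lower().encode("utf-8")).digest()
    return {int.from_bytes(digest[2 * i:2 * i + 2], "big") % SIG_BITS
            for i in range(BITS_PER_TERM)}

def build_descriptor_file(records, block_of):
    """Build one block descriptor file in bit-slice form.

    slices[j] is an int whose bit b is 1 when the signature of
    block b has bit position j set.
    """
    slices = [0] * SIG_BITS
    for rec_id, text in enumerate(records):
        block = block_of(rec_id)
        for term in text.split():
            for j in term_bits(term):
                slices[j] |= 1 << block
    return slices

def candidate_blocks(slices, query_terms, num_blocks):
    """AND together only the slices the query needs; return matching blocks."""
    mask = (1 << num_blocks) - 1
    for term in query_terms:
        for j in term_bits(term):
            mask &= slices[j]
    return {b for b in range(num_blocks) if (mask >> b) & 1}

def candidate_records(num_records, groupings, files, query_terms):
    """Keep records whose block qualifies in every descriptor file."""
    cands = set(range(num_records))
    for (block_of, num_blocks), slices in zip(groupings, files):
        hits = candidate_blocks(slices, query_terms, num_blocks)
        cands = {r for r in cands if block_of(r) in hits}
    return cands

def search(records, groupings, files, query_terms):
    """Filter with signatures, then verify candidates to drop false matches."""
    wanted = {t.lower() for t in query_terms}
    cands = candidate_records(len(records), groupings, files, query_terms)
    return sorted(r for r in cands
                  if wanted <= set(records[r].lower().split()))

# Toy data: 8 records, 4 blocks per grouping (hypothetical example).
records = [
    "signature file indexing", "bit slice storage",
    "library catalog records", "text retrieval signature",
    "block descriptor file", "query processing text",
    "large text databases", "free text search",
]
groupings = [
    (lambda r: r // 2, 4),   # organization 1: consecutive records share a block
    (lambda r: r % 4, 4),    # organization 2: strided records share a block
]
files = [build_descriptor_file(records, g) for g, _ in groupings]

print(search(records, groupings, files, ["text", "signature"]))  # prints [3]
```

Because two records that share a block under one grouping usually fall into different blocks under the other, intersecting the per-file results narrows the candidate set more than a single block descriptor file would. The final verification step is what guarantees exact answers, since block signatures can only produce false positives, never false negatives.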