Search technologies
by Max Maglias



Each of us has faced the problem of searching for information more than once. Regardless of the data source (the Internet, the file system on a hard drive, a database or the global information system of a large company), the difficulties are many: the sheer physical volume of the data being searched, unstructured information, differing file types, and the complexity of wording the search query accurately. We have already reached the stage where the amount of data on a single PC is comparable to the amount of text stored in a proper library. As for unstructured data flows, they are only going to grow, and at a very rapid pace. For an average user this might be a minor misfortune, but for a large company a lack of control over information can mean significant problems. So the need for search systems and technologies that simplify and speed up access to the necessary information arose long ago. Such systems are numerous, and not every one of them is based on a unique technology. Choosing the right one depends directly on the specific tasks to be solved. While the demand for the perfect data searching and processing tools is steadily growing, let’s consider the state of affairs on the supply side.

Without going deeply into the peculiarities of the technology, all search programs and systems can be divided into three groups: global Internet systems, turnkey business solutions (corporate data searching and processing technologies) and simple phrasal or file search on a local computer. Different directions presumably mean different solutions.

Local search


Search on a local PC is straightforward. It has no notable features except for the choice of file type (media, text etc.) and the search location. Just enter the name of the file you are looking for (or a fragment of its text, for example in a Word document) and that’s it. The speed and the result depend entirely on the text entered into the query line. There is no intelligence in this: the program simply looks through the available files to determine their relevance. This is understandable: what is the use of creating a sophisticated system for such uncomplicated needs?
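
As a rough illustration of how little logic such a search needs, here is a minimal Python sketch. The function name `local_search` and its parameters are hypothetical, chosen for this example only: it walks a folder, matches the query against file names, and for the selected file types also against the file text.

```python
import os
from typing import Iterator, Tuple


def local_search(root: str, query: str,
                 extensions: Tuple[str, ...] = (".txt",)) -> Iterator[str]:
    """Yield paths under root whose name or text content contains query."""
    needle = query.lower()
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            # Match on the file name first: this is the cheapest check.
            if needle in name.lower():
                yield path
                continue
            # Only open files whose type was selected for content search.
            if not name.lower().endswith(extensions):
                continue
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read().lower():
                        yield path
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
```

Every file is read on every query, so the running time grows with the amount of data on the disk. That is tolerable for one PC, and it is exactly what stops working at Internet scale.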

Global search technologies


Matters stand quite differently with the search systems operating in the global network. One cannot rely simply on looking through the available data. The huge volume of unstructured information in this global chaos (Yandex, for instance, can boast an index of more than 11 terabytes of data) makes simple search not only ineffective but also slow and labor-intensive. That is why the focus has lately shifted towards optimizing search and improving its quality. But the scheme is still very simple (apart from the secret innovations of each individual system): phrasal search through an indexed database, with proper consideration for morphology and synonyms. Such an approach undoubtedly works, but it doesn’t solve the problem completely. After reading dozens of articles dedicated to improving search with Google or Yandex, one arrives at the conclusion that, without knowing the hidden capabilities of these systems, finding a relevant document takes more than a minute, and sometimes more than an hour. The problem is that this kind of search depends heavily on the word or phrase entered by the user. The vaguer the query, the worse the search. This has become an axiom, or dogma, whichever you prefer.
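
The general scheme of phrasal search over an index can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Google or Yandex work internally: the `stem` function is a crude stand-in for real morphological analysis, and the `SYNONYMS` table is a hypothetical two-word thesaurus.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical synonym table; a real engine would use a full thesaurus.
SYNONYMS = {"car": {"auto"}, "auto": {"car"}}


def stem(word: str) -> str:
    """Crude suffix stripping standing in for real morphological analysis."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> List[str]:
    """Lowercase the text, split it into words and reduce each to its stem."""
    return [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Inverted index: stem -> ids of the documents that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Ids of documents containing every query word or one of its synonyms."""
    result = None
    for term in tokenize(query):
        variants = {term} | {stem(s) for s in SYNONYMS.get(term, set())}
        matches: Set[str] = set()
        for variant in variants:
            matches |= index.get(variant, set())
        result = matches if result is None else result & matches
    return result or set()
```

The index is built once, so each query costs only a few dictionary lookups instead of a scan of all the data. The result, however, still depends entirely on which words the user typed.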


Of course, by using the key functions of the search systems intelligently and choosing the search phrase properly, it is possible to get acceptable results. But this is the result of painstaking mental work and of time wasted looking through irrelevant information in the hope of finding at least some clues on how to improve the query. In general, the scheme is the following: enter a phrase, look through several results, realize that the query was not the right one, enter a new phrase, and repeat these stages until the relevance of the results reaches the highest possible level. But even then the chances of finding the right document are slim. No average user will voluntarily resort to the sophistication of “advanced search” (although it offers a number of very useful functions, such as the choice of language, file format etc.). The best thing would be simply to enter a word or phrase and get a ready answer, without particular concern for how it was obtained. Let the horse think, it has a big head. Maybe this is not exactly to the point, but the name of one Google search function, “I’m Feeling Lucky”, characterizes the existing search technologies very well. Nevertheless, the technology works, not ideally and not always justifying the hopes, but if you allow for the complexity of searching through the chaos of Internet data, it can be considered acceptable.

Corporate systems


Third on the list are the turnkey solutions based on search technologies. They are meant for serious companies and corporations that possess really large databases and run all sorts of information systems and document stores. In principle, the technologies can also be used for home needs. For example, a programmer working remotely will make good use of search to find program source code scattered across his hard drive. But these are particulars. The main application of the technology is still quick and accurate search through large data volumes and work with various information sources. Such systems usually operate by a very simple scheme (although there are undoubtedly numerous unique methods of indexing and query processing beneath the surface): phrasal search with proper consideration for all the stem forms, synonyms etc., which once again leads us to the human factor. When using such a technology, the user must first word the query phrases that will serve as the search criteria and that presumably occur in the documents to be retrieved. But there is no guarantee that the user will be able to choose or remember the correct phrase unaided, or that the search by this phrase will be satisfactory.


One more key point is the speed of processing a query. Of course, when the whole document is used as the query instead of a couple of words, the accuracy of search increases many times over. But to date this possibility has not been used because such a process is so resource-hungry. The point is that search by words or phrases does not return results that are highly similar to what is sought, while search by a phrase as long as a whole document consumes much time and computing power. Here is an example: when a query of one word is processed, the difference in speed is negligible: whether it takes 0.1 or 0.001 second is of no crucial importance to the user. But take an average-sized document containing about 2000 unique words: search with consideration for morphology (stem forms) and thesaurus (synonyms), together with generating a relevance-ranked list of results as in keyword search, will take several dozen minutes (which is unacceptable for a user).
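
To see why the cost grows so sharply, it helps to count the index lookups a query generates once each word is expanded. The sketch below is back-of-the-envelope arithmetic only: the figure of 2000 unique words comes from the example above, while 5 stem forms and 3 synonyms per word are assumed values picked for illustration.

```python
def lookups_needed(unique_words: int, forms_per_word: int,
                   synonyms_per_word: int) -> int:
    """Count index lookups when every word is expanded with synonyms and stem forms."""
    # Each word plus each of its synonyms is looked up in every stem form.
    return unique_words * (1 + synonyms_per_word) * forms_per_word


# Assumed expansion factors: 5 stem forms and 3 synonyms per word.
single_word = lookups_needed(1, forms_per_word=5, synonyms_per_word=3)         # 20
whole_document = lookups_needed(2000, forms_per_word=5, synonyms_per_word=3)   # 40000
```

Under these assumptions the whole-document query needs 2000 times as many lookups as the one-word query, and that is before the much longer result lists are merged and ranked.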

The interim summary


As we can see, the existing search systems and technologies, although they function properly, don’t solve the problem of search completely. Where the speed is acceptable, the relevance leaves much to be desired. Where the search is accurate and adequate, it consumes a lot of time and resources. It is of course possible to solve the problem in a very obvious manner: by increasing computing capacity. But equipping an office with dozens of ultra-fast computers that continuously process phrasal queries consisting of thousands of unique words, struggling through gigabytes of incoming correspondence, technical literature, final reports and other information, is irrational and uneconomical. There is a better way.

The unique similar content search


At present many companies are working intensively on developing full text search. Current computing speeds make it possible to create technologies that handle queries of different orders of magnitude and with a wide array of supplementary conditions. Experience in creating phrasal search gives these companies the expertise to develop and perfect the search technology further. In particular, one of the most popular search engines is Google, and one of its functions is called “Similar pages”. This function lets the user view the pages whose content is most similar to a sample page. Although it works in principle, the function does not yet return relevant results: they are mostly vague and of low relevance, and sometimes it reports that no similar pages exist at all. Most probably, this is a consequence of the chaotic and unstructured nature of information on the Internet. But once the precedent has been set, the advent of perfect search without a hitch is just a matter of time.


As for corporate data processing and knowledge retrieval systems, here matters stand much worse. Functioning technologies (as opposed to those existing only on paper) are very few. And no giant or so-called search technology guru has so far succeeded in creating a real similar content search. Maybe the reason is that it’s not desperately needed; maybe it is too hard to implement. But a functioning one does exist.

SoftInform Search Technology, developed by SoftInform, is a technology for finding documents similar in content to a sample. It enables fast and accurate search for documents of similar content in any volume of data. The technology is based on a mathematical model that analyzes the document structure and selects words, word combinations and text arrays, producing a list of the documents most similar to the sample text, each with a relevance percentage. In contrast to standard phrasal search, similar content search does not require the key words to be determined beforehand: the search is conducted using the whole document. The technology works with several sources of information, which can be stored both in text files of txt, doc, rtf, pdf, htm and html formats and in information systems built on the most popular databases (Access, MS SQL, Oracle, as well as any SQL-supporting database). It additionally supports synonym and important-word functions that make a more specific search possible.
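
The mathematical model behind SoftInform's technology is not described here, so the following Python sketch should not be read as its algorithm. It only illustrates the general idea of searching by a whole sample document and reporting a relevance percentage, using a common textbook stand-in: cosine similarity between word-frequency vectors. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def term_vector(text: str) -> Counter:
    """Bag-of-words vector: how often each word occurs in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_percent(a: Counter, b: Counter) -> float:
    """Cosine similarity of two term vectors, scaled to the range 0..100."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return 100.0 * dot / norm if norm else 0.0


def similar_documents(sample: str, corpus: Dict[str, str],
                      threshold: float = 10.0) -> List[Tuple[str, float]]:
    """Rank corpus documents by similarity to the whole sample text."""
    sample_vec = term_vector(sample)
    scored = [(doc_id, cosine_percent(sample_vec, term_vector(text)))
              for doc_id, text in corpus.items()]
    # Keep only documents above the threshold, most similar first.
    return sorted((s for s in scored if s[1] >= threshold),
                  key=lambda s: s[1], reverse=True)
```

The user supplies no key words at all: the sample document itself is the query, and every document in the collection comes back with a score that can be shown as a percentage.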


Similar content search significantly cuts the time wasted on searching for and reviewing identical or very similar documents, and shortens processing at the stage of entering data into an archive by avoiding duplicate documents and by forming sets of data on a given subject. Another advantage of the SoftInform technology is that it is not very demanding of computing capacity and can process data at very high speed even on ordinary office computers.


This technology is not just a theoretical development. It has been tested and successfully implemented in a project providing legal advice by phone, where the speed of information retrieval is of crucial importance. And it will undoubtedly be more than useful in any knowledge base, analytical service or support department of any large firm. The universality and effectiveness of the SoftInform Search Technology allow it to solve a wide spectrum of problems that arise while processing information. These include fuzzy duplication of information (when a document is entered, it is possible to determine immediately whether it already exists in the database), similarity analysis of documents already in the database, and search for semantically similar documents, which saves the time spent on selecting appropriate key words and viewing irrelevant documents.
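
The duplicate check at the document entry stage can be illustrated with a short Python sketch. It is an assumption-laden example rather than SoftInform's method: it compares documents by their overlapping three-word sequences (shingles) and Jaccard similarity, a standard near-duplicate technique, and the names `find_duplicate` and `threshold` are hypothetical.

```python
import re
from typing import Dict, Optional, Set, Tuple


def shingles(text: str, size: int = 3) -> Set[Tuple[str, ...]]:
    """Set of overlapping word n-grams that characterizes a document."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Share of shingles the two documents have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_duplicate(new_text: str, archive: Dict[str, str],
                   threshold: float = 0.8) -> Optional[str]:
    """Return the id of the closest archived document at or above threshold."""
    new_shingles = shingles(new_text)
    best_id, best_score = None, threshold
    for doc_id, text in archive.items():
        score = jaccard(new_shingles, shingles(text))
        if score >= best_score:
            best_id, best_score = doc_id, score
    return best_id
```

Calling `find_duplicate` before a document is stored tells the operator whether a matching or nearly matching document is already in the archive. Lowering `threshold` widens the check from exact copies to lightly edited versions.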

Prospects


Besides its primary purpose (fast, high quality search for information in huge volumes of data such as texts, archives and databases), an Internet direction can also be defined. For example, it is possible to develop an expert system for processing incoming correspondence and news, which would become an important tool for analysts at various companies. This will be possible mainly thanks to the unique similar content search technology, which so far is absent from every existing system except SearchInform. The problem of spamming search engines with so-called doorways (hidden pages stuffed with key words that redirect to a site’s main pages and are used to raise the site’s ranking with search engines) and the e-mail spam problem (a more intelligent analysis would ensure a higher level of security) would also be solved with the help of this technology. But the most interesting prospect for the SoftInform Search technology is the creation of a new Internet search engine whose main competitive advantage would be the ability to search not just by key words but also for similar web pages, making search more flexible, comfortable and efficient.

To conclude, it can be stated with confidence that the future belongs to full text search technologies, both on the Internet and in corporate search systems. Unlimited development potential, adequate results and fast processing of queries of any size make this technology far more comfortable and much in demand. SoftInform Search technology might not be the pioneer, but it is a functioning, stable and unique technology with no existing analogues (as confirmed by its active Eurasian patent). To my mind, even with the help of “similar search” it will be difficult to find a similar technology.

Copyright © media13.Com  2005. All Rights Reserved.
