Developing a Text Information Retrieval System "project for college"
$30-250 USD
Cancelled
Posted about 9 years ago
Paid on delivery
Introduction
Information retrieval is the process of extracting useful information from data. In the
current era, text constitutes an important form of data. This includes web pages, emails,
SMS messages and several other types of text documents.
Text documents need to be represented in an appropriate format (usually in the form
of vectors of numbers) in order to be used for further processing. Once properly
represented, text documents can be used for various tasks such as classification, for
instance, deciding whether an email is spam, or search, for example, deciding whether
two web pages have similar content.
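As a minimal sketch of the idea above, the following Python snippet represents each document as a vector of word counts and compares two documents by cosine similarity. The function names `term_counts` and `cosine_similarity` are hypothetical, and the choice of raw word counts and cosine similarity is an assumption made for illustration; the description does not prescribe a particular representation or similarity measure.

```python
import math
from collections import Counter


def term_counts(text):
    """Represent a document as a sparse vector: word -> number of occurrences."""
    return Counter(text.split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    page_one = term_counts("cheap pills online")
    page_two = term_counts("cheap flights online")
    print(cosine_similarity(page_one, page_two))
```

A value close to 1 suggests the two documents use the same words in similar proportions, while 0 means they share no words at all.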
Before representing documents as numbers, however, they must be preprocessed. Text
preprocessing is the task of removing unnecessary information from the text. This is
achieved through several steps, which are summarized below:
1. Initial preprocessing: The goal of this step is to "clean up" the document and
prepare it for the remaining tasks. The different tasks conducted in this step are:
(a) Replace tab, carriage-return and newline characters with a space.
(b) Remove all non-letter characters: turn punctuation, numbers, etc. into spaces.
(c) Switch all letters to lowercase.
(d) Replace runs of multiple spaces with a single space.
(e) Remove words that are shorter than 3 characters. For example, remove
"an" but keep "him".
2. Stop words removal: Some words such as "a", "the", "and" are very common in
English and should be removed from the text in order to only leave useful words.
This task is simply done by removing any word that appears in a predefined list of
stop words.
3. Stemming: The same word can take different forms depending on its role and
position in the sentence.
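The first two steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `initial_preprocess` and `remove_stop_words` are hypothetical, "letters" are taken to mean the ASCII letters A-Z and a-z, and `STOP_WORDS` is a small made-up sample standing in for the predefined list, whose contents are not given. Stemming is not included, since the description does not say which algorithm to use.

```python
import re

# Hypothetical sample; the real predefined stop-word list is not specified.
STOP_WORDS = {"the", "and", "for", "with", "that", "this"}


def initial_preprocess(text):
    """Apply steps (a) to (e) and return the remaining words as a list."""
    # (a) Replace tab, carriage-return and newline characters with a space.
    text = re.sub(r"[\t\r\n]", " ", text)
    # (b) Turn every non-letter character into a space (assumes ASCII letters).
    text = re.sub(r"[^A-Za-z]", " ", text)
    # (c) Switch all letters to lowercase.
    text = text.lower()
    # (d) Collapse runs of spaces into one and trim the ends.
    text = re.sub(r" +", " ", text).strip()
    # (e) Drop words shorter than 3 characters.
    return [word for word in text.split(" ") if len(word) >= 3]


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Keep only the words that do not appear in the stop-word list."""
    return [word for word in words if word not in stop_words]


if __name__ == "__main__":
    words = initial_preprocess("An e-mail\tfrom HIM, sent at 10:30!\n")
    print(words)  # ['mail', 'from', 'him', 'sent']
    print(remove_stop_words(["the", "mail", "and", "him"]))  # ['mail', 'him']
```

Because step (e) already discards one- and two-letter words such as "a" and "an", the stop-word list only needs to cover words of three or more characters.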
Hello
I am a Java expert and am interested in this project. I have reviewed your requirements and am confident that I can handle this project well.
Please get in touch to discuss further.
Regards
Anshu
I'm a Java specialist and would like to do this for you. Please send me more details of the specification, as it seems incomplete. You can contact me through mail, Skype or Gtalk: pererabdi
Hi,
I'd be happy to help you with your project. I'm a Mechatronics Engineer who specialises in software development for the high-tech industry, so I'd be perfect for your job. Please contact me for more information.
Best regards,
Matthew Meyer
I have experience in designing and implementing search engines on various text corpora, and I am willing to help you with this project.
By the way, I suggest using lemmatization instead of stemming.
I have good Java skills.
Don't worry, your project will be delivered.
I have already visualized your project in my mind; whether or not I get this bid, I will complete this project.
Hi,
I've worked on this kind of project before, for one of my college projects. I can deliver this project within the timeline and for the specified fee, assuming that you expect the application to match the description provided.
We can sort out any additional requirements which you might have as we progress.
Regards,
Rahul.