Problem. A folder (and/or its subfolders) contains hundreds of individually created HTML, HTM or PHP pages that are not interlinked in any way. As a result, a site map cannot be built, and even if one were, the pages do not link to each other and so will not be indexed properly by search engines. The files are all 'orphaned'.
Required Solution. Preferably a desktop application that can automatically open each page and add links to the other pages, thus linking all the pages together. A sitemap in both HTML and XML is then created.
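To illustrate the sitemap step, here is a minimal sketch. The base URL and the file-name conventions are assumptions for the example, not from the brief:

```python
# Sketch: generate sitemap.xml and sitemap.html from a list of pages.
# BASE_URL is a placeholder; in the real tool it would be user-supplied.
from xml.sax.saxutils import escape

BASE_URL = "http://example.com"  # placeholder assumption

def build_sitemaps(pages):
    """pages: list of (relative_path, page_title) tuples."""
    xml_lines, html_lines = [], []
    for path, title in pages:
        url = escape(BASE_URL + "/" + path)
        xml_lines.append("  <url><loc>%s</loc></url>" % url)
        html_lines.append('  <a href="%s">%s</a><br>' % (url, escape(title)))
    sitemap_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(xml_lines)
        + "\n</urlset>\n"
    )
    sitemap_html = "<html><body>\n" + "\n".join(html_lines) + "\n</body></html>\n"
    return sitemap_xml, sitemap_html
```

The XML output follows the standard sitemaps.org urlset format; the HTML sitemap is just a flat list of links.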
Each hyperlink created on a page must carry a 'title' attribute, the content of which is derived from the '<title>' text of the page the link points to. The URLs are absolute.
IE; <a href="[url removed, login to view]" target="_self" title="Complete line of text derived from the <title> tag on [url removed, login to view]"> this text is truncated to the first X words of the 'title' (configurable by the user from 1 to 10 words) </a>
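The link-building rule above can be sketched as follows. The function name and the default word count are illustrative assumptions:

```python
# Sketch: derive the link's title attribute from the target page's <title>,
# and truncate the visible link text to the first N words (user-configurable).
import re

def make_link(target_url, target_html, max_words=5):
    # Pull the full <title> text from the target page; fall back to the URL
    # if the page has no title.
    m = re.search(r"<title[^>]*>(.*?)</title>", target_html,
                  re.IGNORECASE | re.DOTALL)
    full_title = m.group(1).strip() if m else target_url
    visible = " ".join(full_title.split()[:max_words])
    return ('<a href="%s" target="_self" title="%s">%s</a>'
            % (target_url, full_title, visible))
```

A regex is used here for brevity; a production tool would likely use a real HTML parser to cope with malformed pages.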
These links would be written to the pages in a user-defined area. This is best achieved with a search string box.
IE; place the list of links in each page AFTER, for example;
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<td width="211" nowrap="nowrap" valign="top">
Or place the list of links in each page BEFORE a given string.
NOTE: To ensure accurate placement, the search string could sometimes be quite long, so a start and end point option would also be required.
IE; insert the list of links between <td width="211" nowrap="nowrap" valign="top"> and a second, user-entered end string. An additional user entry box would be required for this.
We may want to give the list of links a font or other attribute, so an option to enter a 'before' and an 'after' string is also needed.
IE; <div align="center">
<font size="1" color="#4F76FF">
(List of Links)
Each link written to the page would use a <br> separator rather than <p>, and its text would begin with a capital letter unless it starts with a number or some other symbol.
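The insertion rules above (search string, before/after wrapper, <br> separators, capitalisation) can be sketched like this; the default wrapper strings simply echo the example above and would really be user-entered:

```python
# Sketch: insert the link list into a page immediately AFTER a user-supplied
# search string, wrapped in user-supplied before/after strings, with <br>
# separators between links.
def insert_links(page_html, search_string, links,
                 before='<div align="center"><font size="1" color="#4F76FF">',
                 after="</font></div>"):
    pos = page_html.find(search_string)
    if pos == -1:
        return page_html  # marker not found; leave the page untouched
    pos += len(search_string)
    block = before + "<br>".join(links) + after
    return page_html[:pos] + block + page_html[pos:]

def capitalize_text(text):
    # Uppercase the first character only if it is a letter; numbers and
    # other symbols are left as-is, per the spec.
    return text[0].upper() + text[1:] if text and text[0].isalpha() else text
```

The BEFORE-string and start/end variants would be the same idea with a different splice position.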
Now for the more difficult mathematical part, where two things must be considered.
1. Matrixing. With large numbers of pages it would be impracticable, and downright ugly, to add links to every other page on each page. So the links must be matrixed. For example:
Take a simple site with 100 pages in one directory. Instead of placing 99 links on each page, we could place just 10 links on each page, matrixed in such a way that all 100 pages are interlinked, either directly or indirectly.
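One simple matrixing scheme that satisfies this (an assumption of mine, not mandated by the brief): sort the pages, then have each page link to the next k pages in circular order. The underlying ring guarantees every page is reachable from every other, directly or indirectly:

```python
# Sketch: each page links to the next k pages in circular order.
# With 100 pages and k=10, every page carries 10 links, yet all 100
# pages remain connected via the ring.
def matrix_links(pages, k=10):
    n = len(pages)
    k = min(k, n - 1)  # never link a page to itself
    return {pages[i]: [pages[(i + j) % n] for j in range(1, k + 1)]
            for i in range(n)}
```

Other schemes (random graphs, hub-and-spoke) would also work; the ring is just the easiest to verify.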
2. If the site's directory structure is deep;
Unless linking is done correctly, main pages lose SEO page ranking, which is bad. This is especially true where the site has deeper directory levels. IE; [url removed, login to view]
The industry standard for good page-rank linking in a deep directory structure is as follows;
* Make sure that your primary page(s), the [url removed, login to view] page, links to your secondary pages or secondary levels.
* Make sure that your secondary pages link to each other and back to the primary [url removed, login to view] page.
* Link your secondary pages to the third level pages within their sub-directory, sub-domain, or level.
* Link the third level pages within each specific sub-directory or sub-domain to each other.
* Link the third level pages back to the secondary page from which they are linked.
* If there are fourth level pages, follow the same linking structure laid out above.
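The level-based rules above can be sketched as a planning function. Grouping pages by directory and treating an index page as each level's "primary" page are assumptions for the example (the real tool would let the user name the index file):

```python
# Sketch of the level-based linking rules: pages in a directory link to each
# other and back up to their parent directory's index page, and that index
# links down to them. "index.htm" as the primary page is an assumption.
import os

def plan_hierarchy_links(paths, index_name="index.htm"):
    by_dir = {}
    for p in paths:
        by_dir.setdefault(os.path.dirname(p), []).append(p)
    plan = {p: set() for p in paths}
    for d, pages in by_dir.items():
        for p in pages:  # pages within a level link to each other
            plan[p].update(q for q in pages if q != p)
        parent_index = os.path.join(os.path.dirname(d), index_name)
        if d and parent_index in plan:
            for p in pages:  # link down from the parent index, and back up
                plan[parent_index].add(p)
                plan[p].add(parent_index)
    return plan
```

Applying the same rule at every depth covers fourth and deeper levels automatically, matching the last bullet above.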
That's the project. It can be written in any language, provided it is stable on any Windows desktop. Speed is not of the essence; reliability and perfection of operation are. Once the parameters are set, the program should perform the task automatically.
A desktop application is preferred, though if you think it can be better achieved server side, that's fine provided there is a user-friendly front end.