Advanced features by topic

Find and repair broken links / URL history

For most types of websites within WebHare you can use the link checker, which gives you a list of the links on your website that no longer work. The link checker works like a search engine, so it does not provide real-time results: it can take about 48 hours before a change is picked up. The Excel download (explained below) contains a column that shows when each link was last checked.

To prevent broken links, WebHare uses the URL history feature (see the last section of this page). Whenever a visitor would see a 'page not found' message, WebHare knows which file was published on that URL in the past and automatically redirects the visitor to that file. Users can also add URLs to the URL history manually, for example when they delete files.

You can check links in one of two ways from the Publisher in WebHare:

  1. Per text page: The last tab of the properties panel of any text file shows which links on that specific page are broken. You can look up and change these links on the page.
  2. A complete report for the whole website: In the left pane, select the globe icon to select the whole website, then open the menu in the upper right and choose Sites > Link check.

The linkchecker report for the whole site

If you chose the second option, WebHare displays a list of all broken links:

The different columns:

  • Column 1/file: shows which file/page of your website contains a broken link.
  • Column 2/URL: shows which link is broken.
  • Column 3/Link text: shows the text that carries the broken link, so that you can find it more easily on your page.
  • Column 4/Status: shows (when you hover over it with the cursor) why the system 'thinks' the link is broken. When a link is broken, the link checker reports ‘Page not found/404’. Other messages include ‘Login/401’ when a login is required and ‘No access/403’ when the link leads to an intranet that WebHare cannot access. Advice: fix the ‘Page not found/404’ messages and quickly scan the rest for anything that looks strange. Most of the time you do not have to fix the other messages.

You have two options: work directly on the screen or download an Excel file.

Option 1: Working on the screen of the link checker

If the list is not extremely long, it is usually easiest to work on the screen. Double-click a row to go to the related folder in WebHare. There you can open the file (in the WebHare text editor) or download it (Word or other files) to edit the link.

Option 2: Download the list as an Excel file

You can also export the list to an Excel file (or .csv). Use the CSV Export button at the top left of the screen and choose either Excel or .csv in the next step.

The Excel file contains some additional columns, such as:

  • The date on which the link was last checked (a date within the last three days)
  • The date on which the file was created or last edited
  • Etc.

For a good overview, we advise the following:

  1. Delete some of the columns, so that you are left with only a few:
    1. URL (the link that no longer works)
    2. File URL (the file in WebHare that contains the broken link)
    3. Status code
  2. Sort the 'status code' column alphabetically from A to Z: select the column by clicking the gray header bar above it, then click Sort & Filter A-Z in Excel (see the screenshot below). Finally, choose A-Z, low to high. On the next screen click OK, and the column is sorted.
  3. You only need to keep the 404 lines. The other lines can be discarded (you may want to check them briefly for anything that looks strange and could be causing the error).
  4. Now that only the 404 messages are left, sort them by file URL: select the file URL column (by clicking the gray header bar above it) and sort A-Z once again. All broken links on a given page are then listed directly below each other, so you do not have to look up the same file multiple times.
  5. Go through the file URL list from top to bottom and see what's broken.
    1. Search for the file in the Publisher in WebHare: Open the folders step by step to find the folder you are looking for.
      Hint: Select and copy (CTRL C) the first URL, go to the Publisher in WebHare and search for the file using CTRL SHIFT G. This function is also available in the menu on the upper right-hand side: choose Menu, Go To/Site or URL. [CTRL SHIFT G only works if you have just opened the Publisher or have selected a random file in the Publisher.]
    2. Open the file (WebHare text editor) or download it to your computer (Word or other files), then search for and edit the link (as found in the first column of your Excel sheet).
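
If you prefer to script the filtering and sorting steps above instead of doing them by hand in Excel, the following is a minimal sketch in Python. It assumes a comma-separated .csv export whose header row uses the column names 'URL', 'File URL' and 'Status code' as listed above; the actual export may use different labels or a different delimiter, and the sample rows are made up for illustration.

```python
import csv
import io

# Column names follow the list above; the real export may label them differently.
URL_COL = "URL"
FILE_COL = "File URL"
STATUS_COL = "Status code"


def broken_links_by_file(csv_text, delimiter=","):
    """Return the 404 rows of a link checker export, sorted by file URL."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    rows = [
        {URL_COL: row[URL_COL], FILE_COL: row[FILE_COL], STATUS_COL: row[STATUS_COL]}
        for row in reader
        if row[STATUS_COL].strip() == "404"
    ]
    # Sorting by file URL groups all broken links of one page together.
    rows.sort(key=lambda row: (row[FILE_COL], row[URL_COL]))
    return rows


# Made-up sample data, only to show the shape of the input.
SAMPLE = """URL,File URL,Status code,Last checked
https://example.org/old,https://example.com/b.html,404,2024-01-02
https://example.org/login,https://example.com/a.html,401,2024-01-02
https://example.org/gone,https://example.com/a.html,404,2024-01-03
"""

if __name__ == "__main__":
    for row in broken_links_by_file(SAMPLE):
        print(row[FILE_COL], "->", row[URL_COL])
```

The function drops every column except the three recommended ones, keeps only the 404 rows, and orders them so that each page needs to be opened only once.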

The 'status' column in the Excel file contains a code. If this code is a single digit (e.g. 0, 3 or 7), you can ignore the error message. Codes of 100 and higher are standard HTTP status codes. If no HTTP connection could be established while checking a link, there is no HTTP status code; the same can happen in other error situations (e.g. a page that redirects to itself). The following extra codes are used:

| #   | Meaning                   | Explanation                                       |
|-----|---------------------------|---------------------------------------------------|
| 001 | Socket error              | Error creating TCP socket                         |
| 002 | Server not found          | The server name could not be resolved             |
| 003 | Could not connect         | No connection to the server could be made         |
| 004 | Connection timed out      | Making a connection to the server timed out       |
| 005 | No HTTP connection        | Could not open HTTP connection                    |
| 006 | No secure connection      | Connection could not be secured (SSL connection)  |
| 007 | Request could not be sent | The HTTP request could not be sent to the server  |
| 008 | No HTTP response          | No valid HTTP response received from the server   |
| 009 | Circular redirection      | Got a circular redirection                        |
| 010 | Too many redirections     | Redirection chain is too long                     |

Important HTTP status codes are:

| #   | Meaning            | Explanation                                                                                                |
|-----|--------------------|------------------------------------------------------------------------------------------------------------|
| 400 | Bad request        | Manually check whether the link is still correct.                                                          |
| 403 | No access          | Mostly because of the intranet; the link checker has no access. Manually check whether the link still works. |
| 404 | Page not found     | The link is broken and needs to be replaced. This is the most important code for users.                    |
| 405 | Method not allowed | Manually check whether the link still works.                                                               |
| 500 | Server error       | Cannot be fixed on your side. There is a problem with the website that is linked to.                       |
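
As an illustration of how the two kinds of codes can be told apart when processing the export, here is a small Python sketch. The lookup tables are copied from the two tables above; the function name `describe_status` and the fallback texts are made up for this example and are not part of WebHare.

```python
# Extra link checker codes (no HTTP status available), from the first table above.
CHECKER_CODES = {
    1: "Socket error",
    2: "Server not found",
    3: "Could not connect",
    4: "Connection timed out",
    5: "No HTTP connection",
    6: "No secure connection",
    7: "Request could not be sent",
    8: "No HTTP response",
    9: "Circular redirection",
    10: "Too many redirections",
}

# Advice per HTTP status code, from the second table above.
HTTP_ADVICE = {
    400: "Manually check whether the link is still correct.",
    403: "Manually check whether the link still works.",
    404: "The link is broken and needs to be replaced.",
    405: "Manually check whether the link still works.",
    500: "Cannot be fixed on your side.",
}


def describe_status(code):
    """Return (kind, text) for a status value such as '007', '404' or 404."""
    number = int(str(code).strip())
    if number < 100:
        # Below 100: the link checker never got a usable HTTP response.
        return ("link checker", CHECKER_CODES.get(number, "Unknown link checker code"))
    # 100 and higher: a standard HTTP status code.
    return ("http", HTTP_ADVICE.get(number, "No advice listed for this code"))


if __name__ == "__main__":
    for value in ("007", "404", "500"):
        print(value, describe_status(value))
```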

URL HISTORY

The URL history helps prevent broken links and has two functions:

Function 1: Redirect visitors automatically

Do you ever move files, or give them a different name? In both cases they get a different URL (link). The URL history prevents the visitor from seeing a "page not found" notification. Each time a visitor would get a "page not found" message, WebHare looks up which file was previously published on that URL. If that file still exists, the visitor is automatically redirected to it.
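
The lookup described above can be pictured with a small Python sketch. This is only a conceptual model, not WebHare's actual implementation: the function `resolve`, the two dictionaries and the example.com URLs are all hypothetical.

```python
def resolve(url, current_urls, history):
    """Decide what to do with a requested URL.

    current_urls: file id -> URL where that file is published now
    history:      old URL -> id of the file that used to live there
    """
    if url in current_urls.values():
        return ("serve", url)  # the URL is still live
    file_id = history.get(url)
    if file_id in current_urls:
        # The file still exists under another URL: redirect instead of a 404.
        return ("redirect", current_urls[file_id])
    return ("not_found", None)  # unknown URL, or the file was deleted


if __name__ == "__main__":
    current_urls = {"file-1": "example.com/en/ceremonies/programme.pdf"}
    history = {
        "example.com/plechtigheden/en/program.pdf": "file-1",
        "example.com/en/old/leaflet.pdf": "file-2",  # file-2 was deleted
    }
    for requested in list(history) + ["example.com/en/ceremonies/programme.pdf"]:
        print(requested, "->", resolve(requested, current_urls, history))
```

The third outcome shows why adding a URL to the history of a page that still exists is useful when you delete a file: without such an entry the visitor ends up at "page not found".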

Function 2: Manually view and adjust the URL history

You can also view the URL history and manually add URLs to it. This is useful when deleting pages: instead of letting the old URL return a "page not found" error, you can add the URL of the deleted page to the URL history of a page that still exists, so that visitors are forwarded to that page.

View URL History:

  1. Select a file that has a URL, such as an editor file, a PDF, a Word document, etc.
  2. Go to the menu (three bars) in the upper right corner of the Publisher and open the URL history via Edit > URL History.

The URL history for that particular file will be displayed:

In this specific case, the website was previously published at www.utwente.nl/academischeplechtigheden/en/... and has since moved to www.utwente.nl/en/academic-ceremonies/... This particular PDF file has also had different names. All of these URLs continue to work as long as they are listed in the URL history.

Adding URL History

  1. Find the URL that no longer works. It may come from a broken-link notification that someone sent you by email, or it may be the URL of a file you deleted yourself.
  2. Copy this URL (CTRL + C).
  3. Open the URL history as described above.
  4. Click Add on the URL History screen and paste the URL into it (CTRL + V).

Please note: 

  • You can only add URLs to the URL history if you have permissions for the particular website. If you manage the website www.utwente.nl/en/academic-ceremonies/, you can only add URLs that start with this path. If you want to add another URL to your URL history, the administrators can do that for you.
  • A shortened URL is often attached to the URL history of a website's homepage. It is therefore important that this file is not deleted, because its URL history would be lost with it. To prevent this, the index file of the website is generally protected (shown with a pin icon) so that it cannot be deleted.
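
The path restriction in the first note can be illustrated with a short Python sketch. It is an assumption about how such a check could work, not WebHare's actual permission logic; the function names and the sample file name are hypothetical.

```python
def strip_scheme(value):
    """Remove a leading http:// or https:// so URLs can be compared by path."""
    for scheme in ("https://", "http://"):
        if value.startswith(scheme):
            return value[len(scheme):]
    return value


def may_add_to_history(url, managed_root):
    """True if url lies below the website root that the user manages."""
    root = strip_scheme(managed_root.strip())
    if not root.endswith("/"):
        root += "/"  # avoid matching sibling folders with a similar name
    return strip_scheme(url.strip()).startswith(root)


if __name__ == "__main__":
    root = "www.utwente.nl/en/academic-ceremonies/"
    print(may_add_to_history("https://www.utwente.nl/en/academic-ceremonies/old.pdf", root))
    print(may_add_to_history("www.utwente.nl/en/other-site/old.pdf", root))
```

A URL outside the managed path fails this check, which corresponds to the case where an administrator has to add the URL for you.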