Hochschule RheinMain

Automated Reasoning for Web Usability Problems

The goal of our project is to detect usability problems within web sites automatically. Experts working on improving usability should be led directly to the problem areas instead of having to analyze an entire web site.

An advantage of web applications is that they run on the service provider's server. It is therefore possible to collect a large amount of usage information without directly observing the users. Additional usage information, such as focus events, scroll positions, and content changes, can be collected with dedicated support in the web 2.0 application. Most importantly, the paths users follow are of interest. Paths taken by several users are particularly important for usability analysis, because they show which parts of a web 2.0 application are used frequently, rarely, or not at all. It can also be determined how long a user remains on a page. If a user stays on a page only briefly, the page might offer too little information. If he or she stays very long although the information provided is not extensive, the web application might be too difficult to use. If many users leave a web application at exactly the same point without completing a task (e.g., a purchase), something might be wrong with that page, and it should be investigated.
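The two indicators described above (pages where sessions end without completing a task, and the time spent per page) can be sketched as follows. This is only a minimal illustration: the record layout, the page names, and the function names exit_pages and mean_dwell are assumptions and are not taken from the project's actual log format.

```python
from collections import Counter

# Hypothetical page-view records: (session_id, page, seconds_on_page).
# The views of each session are assumed to be in chronological order.
VIEWS = [
    ("s1", "/home", 12), ("s1", "/cart", 40), ("s1", "/checkout", 95),
    ("s2", "/home", 8), ("s2", "/cart", 35), ("s2", "/checkout", 120),
    ("s3", "/home", 3), ("s3", "/cart", 50), ("s3", "/checkout", 80),
    ("s3", "/done", 5),
]


def exit_pages(views, goal_page):
    """Count the last page of every session that never reached goal_page."""
    last_page = {}
    reached_goal = set()
    for session_id, page, _seconds in views:
        last_page[session_id] = page  # later views overwrite earlier ones
        if page == goal_page:
            reached_goal.add(session_id)
    return Counter(
        page for session_id, page in last_page.items()
        if session_id not in reached_goal
    )


def mean_dwell(views):
    """Return the average number of seconds spent on each page."""
    totals, counts = Counter(), Counter()
    for _session_id, page, seconds in views:
        totals[page] += seconds
        counts[page] += 1
    return {page: totals[page] / counts[page] for page in totals}


if __name__ == "__main__":
    print(exit_pages(VIEWS, "/done"))
    print(mean_dwell(VIEWS))
```

In this sample data, two of three sessions end on the checkout page without reaching the goal page, which is the kind of accumulation that would be flagged for investigation.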

First, the user activity needs to be collected. This is done by an AJAX script that logs user activities. Next, the large volume of log data is processed to extract useful information: only entries such as page views, content changes, and focus events are of interest for usage analysis. All log entries are then assigned to individual users by means of session tracking. The resulting data can be filtered again; during an analysis it might be useful to consider only selected activities, e.g., those within a time range or related to a particular circumstance. Suitable visualizations of the user activity are also provided.
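The processing steps described above (keeping only relevant entries, assigning them to sessions, and filtering by time range) might look roughly like the following sketch. The field names, the event names, and the function group_by_session are hypothetical; the real log format of the AJAX script is not specified here.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log entries; all field and event names are assumptions.
LOG = [
    {"session": "a", "time": "2010-05-01T10:00:05", "event": "pageview", "target": "/home"},
    {"session": "b", "time": "2010-05-01T10:00:07", "event": "pageview", "target": "/home"},
    {"session": "a", "time": "2010-05-01T10:00:09", "event": "image_request", "target": "/logo.png"},
    {"session": "a", "time": "2010-05-01T10:01:30", "event": "focus", "target": "#search"},
    {"session": "b", "time": "2010-05-01T11:15:00", "event": "pageview", "target": "/cart"},
]

# Event types considered relevant for usage analysis in this sketch.
RELEVANT = {"pageview", "focus", "scroll", "content_change"}


def group_by_session(log, start=None, end=None):
    """Drop irrelevant entries, apply an optional time window,
    and return the remaining events per session in time order."""
    grouped = defaultdict(list)
    for entry in log:
        if entry["event"] not in RELEVANT:
            continue
        timestamp = datetime.fromisoformat(entry["time"])
        if start is not None and timestamp < start:
            continue
        if end is not None and timestamp > end:
            continue
        grouped[entry["session"]].append(
            (timestamp, entry["event"], entry["target"])
        )
    return {sid: sorted(events) for sid, events in grouped.items()}


if __name__ == "__main__":
    for sid, events in group_by_session(LOG).items():
        print(sid, [target for _ts, _event, target in events])
```

Passing start or end restricts the analysis to a time range, which corresponds to the second filtering step mentioned above.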

To assess usability, additional information needs to be gathered about a web 2.0 application. To determine the complexity of a web page, the number of syllables is calculated. For this, it is first necessary to remove all non-visible text. Web 2.0 applications also contain navigation menus and other text that does not belong to the main content; content extraction is used to separate out the main content. Another measure of complexity is readability, which is commonly evaluated using readability indices such as SMOG. The subject of a text is also of interest. Subject classification aims at finding out what a text is about; possible subjects are, e.g., biology or economics. In our case it is also possible to determine whether a text is a general text or a technical text.
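As an illustration of a readability index, the following sketch computes the SMOG grade with the standard formula 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291, where a polysyllable is a word with three or more syllables. The syllable counter is a rough vowel-group heuristic for English and is an assumption of this sketch, not the method used by the project; the input is assumed to be main-content text that has already been extracted.

```python
import math
import re


def count_syllables(word):
    """Rough English heuristic: count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # A trailing silent "e" (but not "-le") usually adds no syllable.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def smog_grade(text):
    """Return the SMOG grade of an already extracted main-content text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    words = re.findall(r"[A-Za-z]+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


if __name__ == "__main__":
    sample = "Usability matters. The page is short."
    print(round(smog_grade(sample), 2))
```

SMOG was defined for samples of 30 sentences; the factor 30 / sentences normalizes shorter or longer texts, so results for very short texts should be treated with caution.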

Another measure is the number and type of input elements on a web page. If a page contains input elements, a user needs more time to work through it, and the number of input elements strongly influences the time required. The visual complexity can serve as an indicator of design quality.
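Counting input elements by type can be sketched with Python's standard html.parser module. The class name InputCounter and the decision to ignore hidden fields (because users never interact with them) are assumptions of this sketch.

```python
from collections import Counter
from html.parser import HTMLParser


class InputCounter(HTMLParser):
    """Count the form controls of a page, grouped by type."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # An <input> without a type attribute behaves like a text field.
            input_type = (dict(attrs).get("type") or "text").lower()
            if input_type != "hidden":  # hidden fields need no user action
                self.counts[input_type] += 1
        elif tag in ("select", "textarea"):
            self.counts[tag] += 1


def count_inputs(html):
    """Return a Counter mapping control type to number of occurrences."""
    parser = InputCounter()
    parser.feed(html)
    return parser.counts


if __name__ == "__main__":
    page = (
        '<form><input type="text" name="q"><input type="hidden" name="t">'
        '<input type="checkbox"/><select></select><textarea></textarea>'
        "<input></form>"
    )
    print(count_inputs(page))
```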

RESTful web services are provided to obtain these measurements. Feel free to use these web services.

The project consists of two components: an analyzer and a crawler. The analyzer tracks user-related data. The crawler collects information including page structure, types of content, and text readability. Both data sets are encoded in XML. To apply reasoning algorithms, the XML data must be converted to an ontology. This is done by the reporting & ontology production component, which calculates reference values from the measurements for every individual web page and provides these values in an ontology. In addition, rules describing possible usability problems need to be defined. Using these rules, a reasoner can detect usability problems. These problems then need to be visualized properly.
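The idea of converting XML measurements into statements and applying rules to them can be illustrated with a strongly simplified sketch. Plain (subject, property, value) tuples stand in for the ontology, and a single Python function stands in for a reasoner rule; the XML element names, the property names, and the thresholds are all hypothetical and do not reflect the project's actual schema or rule set.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML output; element and attribute names are assumptions.
MEASUREMENTS = """
<pages>
  <page url="/checkout">
    <measure name="inputElements">14</measure>
    <measure name="exitRate">0.6</measure>
  </page>
  <page url="/home">
    <measure name="inputElements">1</measure>
    <measure name="exitRate">0.1</measure>
  </page>
</pages>
"""


def to_triples(xml_text):
    """Turn every measurement into a (page, property, value) statement."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for page in root.iter("page"):
        for measure in page.iter("measure"):
            triples.append(
                (page.get("url"), measure.get("name"), float(measure.text))
            )
    return triples


def rule_abandoned_form(triples, max_inputs=10, max_exit_rate=0.5):
    """Example rule: a page with many input elements and a high exit
    rate is reported as a possible usability problem."""
    many_inputs = {s for s, p, v in triples
                   if p == "inputElements" and v > max_inputs}
    high_exit = {s for s, p, v in triples
                 if p == "exitRate" and v > max_exit_rate}
    return sorted(many_inputs & high_exit)


if __name__ == "__main__":
    print(rule_abandoned_form(to_triples(MEASUREMENTS)))
```

A real ontology and reasoner would express such rules declaratively and could infer further facts; the sketch only shows how reference values per page and a rule combine to point an expert to a problem area.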

Project Publications