In the previous post, “How to Measure User Interface Efficiency”, I stated that it is easy to create a User Experience Design (UXD) or Interaction Design (IxD) interface that minimizes the cognitive and manipulative load of executing a specific task. This interface must be usable in the three most used interaction modes: graphical, voice, and text.
Author: Josef Betancourt
CR Categories: H.5.2 [User Interfaces]: Input devices and strategies — Interaction styles ;
Here is a draft of what I’m thinking about this.
Let’s review the problem. A user desires some action X. To trigger X, there must be one or more sub-steps that supply the information or trigger the subprocesses needed for X to succeed. X can be anything: an ATM transaction, insurance forms on a website, or sharing a web page. Let’s use the last example for a concrete discussion.
On my Android phone (Samsung Galaxy Note) when I am viewing a web page, I can share it by:
- Click the menu button
- View the resulting menu
- Find “Share page”
- Click “Share page”
- Get a menu “Share via”
- Find “Messaging”
- Can’t find it
- Scroll menu down
- Find it
- Click “Messaging”
- Get Message app, ‘Enter recipient’
- Click Contact button
- Get ‘Select a contact’ app
- Click ‘Favorites’ button
- Search for who you want to send to
- Put check box on contact’s row
- Click ‘Done’ button.
- Get back to Message app
- Click ‘Send’ button
And that is just a high-level view. Of course, systems can use recently-used lists or search to reduce the complexity. If you include the decision making going on, the list is much longer. Other phones will have similar task steps, hopefully far fewer, but that is not the point. The interaction diagram is shown in figure 1. TODO: show interaction diagram.
This interaction may seem quick and easy. Yet the fact that it has so many steps is symptomatic of current user interfaces, and it has many drawbacks.
- Cognitive load: Despite all warnings and prohibitions, mobile devices will be used in places they should not be, like cars. These task manipulations just make things much worse.
- Effort: All of these tasks eventually add up to a lot of effort. That is acceptable for a social activity, but when it is part of a job it is not profitable.
- Accuracy: The more choices, the greater the possibility of error. As modern user interfaces are used in more situations this can be a problem. Does one want to launch a nuke or order lunch?
- Time: These tasks add up to much time.
- Performance: As we do more multitasking (the good kind), these interactions slow down our performance. The time the computer itself takes is negligible by comparison.
Interacting with computer interfaces is just too complex and manipulative. How can this be made simpler?
In the industry there has been a lot of progress in this area. However, the predominant technique used is the Most Recently Used (MRU) strategy. This is found in task bars, drop down menus, and so forth. Most recently in one Android phone the Share menu item has an image of the last application used to share a resource. The user can click the “share…” and use the subsequent cascading menu or click on the image to reuse that app to share again.
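To make the MRU baseline concrete, here is a minimal sketch of a most-recently-used list. The class name `MruList`, the capacity of three, and the app names are my own illustrative assumptions, not how any particular phone implements its Share menu.

```python
from collections import OrderedDict


class MruList:
    """Keeps the most recently used items, newest last internally."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self._items = OrderedDict()

    def use(self, item):
        # Re-inserting moves the item to the most-recent end.
        self._items.pop(item, None)
        self._items[item] = True
        # Drop the least recently used entry once over capacity.
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def most_recent(self):
        # The item a "share again" icon would show; None if nothing was used.
        return next(reversed(self._items), None)

    def ordered(self):
        # Newest first, as a menu would list them.
        return list(reversed(self._items))


share_targets = MruList(capacity=3)
for app in ["Email", "Messaging", "Bluetooth", "Messaging", "Twitter"]:
    share_targets.use(app)
```

Here `share_targets.most_recent()` is the last app used, regardless of what is being shared or with whom. That blindness to context is the limitation of a pure MRU strategy.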
This is an improvement. However, as discussed below, further optimizations are possible in actually invoking the selected sharing application.
Use prior actions to determine current possible actions. What could be simpler? In the current scenario, as soon as I select the ‘Share’ option, the system will generate a proposal that is based on past actions. Note that this is not just a “Most Recently Used” strategy; it is also based on context. If I am viewing a web page on cooking and click share, most likely I am targeting a subset of my contacts with whom I have previously shared cooking-related pages.
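A rough sketch of the idea, using the simplest possible mechanism: count past share actions per context and propose the most frequent one. The `ShareHistory` class, the single "topic" string standing in for context, and the sample contacts are all hypothetical; a real system would derive context from page content, URL, time, and so on.

```python
from collections import Counter, defaultdict


class ShareHistory:
    """Counts past (app, contact) share actions per context topic."""

    def __init__(self):
        self._by_topic = defaultdict(Counter)

    def record(self, topic, app, contact):
        self._by_topic[topic][(app, contact)] += 1

    def propose(self, topic):
        # Most frequent past action for this topic, or None if unseen.
        counts = self._by_topic.get(topic)
        if not counts:
            return None
        (app, contact), _ = counts.most_common(1)[0]
        return {"app": app, "contact": contact}


history = ShareHistory()
history.record("cooking", "Messaging", "Mary")
history.record("cooking", "Messaging", "Mary")
history.record("cooking", "Email", "Bob")
history.record("work", "Email", "Alice")  # the most recent action overall

proposal = history.propose("cooking")
```

The last recorded action was an email to Alice, so a pure MRU strategy would offer Email. Because the page is about cooking, the context-based proposal is instead a message to Mary. For an unseen topic `propose` returns `None`, and the normal share flow would run.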
Now I can just switch to that proposal and with one click accomplish my task. If the proposal is not quite what I had in mind, I can click on the aspect or detail that is incorrect, or I can continue with my ongoing task selections, and each successive action will enhance the proposal.
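The refinement step can be sketched as follows: every selection the user makes narrows the set of past actions that are still consistent, and the remaining slots are re-filled from that narrower set. The slot names `app` and `contact` and the function `refine` are assumptions made for illustration only.

```python
from collections import Counter

SLOTS = ("app", "contact")  # hypothetical slot names


def refine(history, chosen):
    """Fill unchosen slots from past actions consistent with the choices so far."""
    matching = [
        past for past in history
        if all(past[slot] == value for slot, value in chosen.items())
    ]
    proposal = dict(chosen)
    for slot in SLOTS:
        if slot not in proposal and matching:
            values = Counter(past[slot] for past in matching)
            proposal[slot] = values.most_common(1)[0][0]
    return proposal


past_actions = [
    {"app": "Messaging", "contact": "Mary"},
    {"app": "Messaging", "contact": "Mary"},
    {"app": "Email", "contact": "Bob"},
    {"app": "Email", "contact": "Bob"},
    {"app": "Email", "contact": "Bob"},
]

initial = refine(past_actions, {})
after_click = refine(past_actions, {"app": "Messaging"})
no_match = refine(past_actions, {"app": "Bluetooth"})
```

With no selections the proposal is Email to Bob. Once the user picks Messaging, the proposal switches to Mary. When nothing in the history matches, the unfilled slot is simply left for the user to complete as usual.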
The result is that in the best-case scenario, the task will be completed in two steps versus twenty: a 90% reduction. In the worst case, the user can continue with the task as usual or modify the proposal. But the next time the same task is begun, the generated proposal will be more accurate.
What does a proposal look like? Dependent on the interaction mode (voice, graphical, gestural, text), the proposal will be presented to the user in the appropriate manner. Each device or computer will have a different way of doing this which is dependent on the interface/OS.
Let’s look at a textual output. When I make the first selection, ‘Share’, another panel in the user interface will open and present the proposal based on past actions. If there was no past action with a close enough match, the proposal is presented in stages. This could be the simplest form:
Of course, it would look much better and follow the GUI L&F of the host device (Android, iOS, Windows, …). In a responsive design the proposal component would be vertical in a portrait orientation.
The fields on the Proposal will be links to the associated field’s data type: email address, URL, phone, and so forth. This gives the user a shortcut to invoke the registered application for that data type. In the above example, if I am not sending to Mary, I just click on her name and enter the contacts application and/or get a list of the most likely person(s) I am sending the web page to (based on web page content, URL, etc.). Also, if I am not sending an SMS message, when I click something else, like email, the proposal changes accordingly. When I send email, I am generally sending to a co-worker, for example.
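One way to picture the linked fields is a lookup from each field's data type to the application registered for that type. The registry contents, the field names, and the `proposal_links` helper below are hypothetical; on a real device the host OS would supply this mapping.

```python
# Hypothetical registry: data type -> application registered to handle it.
HANDLERS = {
    "contact": "Contacts",
    "phone": "Dialer",
    "email": "Email",
    "url": "Browser",
}


def proposal_links(fields):
    """Pair each proposal field's value with the app that clicking it opens."""
    return {
        name: (value, HANDLERS.get(data_type))
        for name, (data_type, value) in fields.items()
    }


links = proposal_links({
    "to": ("contact", "Mary"),
    "page": ("url", "http://example.com/recipe"),
})
```

Clicking the "to" field would then open the Contacts application, where the correction is made, and control returns to the updated proposal.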
To present an analogy of a similar approach, in Microsoft’s Outlook application one can create rules that control the handling of incoming email. A rule has many predefined actions in the rule domain specific language (VB code in this case). See figure 3. Of course, the Outlook rule interface is not proactively driven. You could select the same options a million times and the interface will never change to predict that.
A proposal is an automatically and dynamically generated rule whose slots are filled in according to the probabilities of past actions. That rule is translated into an appropriate Proposal in the current UI mode. When that rule is triggered, that is, when the user agrees with the proposal, the associated apps that perform the desired task are activated.
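As a sketch of this definition, the code below fills each slot with the most frequent past value, keeps its relative frequency as the probability, and then renders the rule for a given interaction mode. The slot names, the wording of the rendered text, and both function names are my assumptions; the relative-frequency estimate is only the simplest stand-in for whatever probability model is actually used.

```python
from collections import Counter


def fill_slots(history, slots):
    """Return {slot: (value, probability)} from relative frequencies.

    Assumes history is a non-empty list of dicts keyed by slot name.
    """
    rule = {}
    for slot in slots:
        counts = Counter(action[slot] for action in history)
        value, hits = counts.most_common(1)[0]
        rule[slot] = (value, hits / len(history))
    return rule


def render(rule, mode):
    """Translate the rule into a proposal for one interaction mode."""
    app, contact = rule["app"][0], rule["contact"][0]
    if mode == "voice":
        return f"Share this page with {contact} using {app}?"
    return f"Share via: {app} | To: {contact} | [Send]"


past_actions = [
    {"app": "Messaging", "contact": "Mary"},
    {"app": "Messaging", "contact": "Mary"},
    {"app": "Messaging", "contact": "Mary"},
    {"app": "Email", "contact": "Bob"},
]
rule = fill_slots(past_actions, ("app", "contact"))
text_proposal = render(rule, "text")
voice_proposal = render(rule, "voice")
```

The stored probabilities could also decide whether a proposal is confident enough to show at all, or whether it should be presented in stages.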
Image created with Dia
Potential techniques that could be used are:
- Machine Learning / AI
- Behavior Trees (BT)
- Bayesian Nets (BN)
- Case-based Reasoning (CBR)
- Case-based Planning (CBP)
- Hierarchical Finite State Machines (HFSM)
- Ad-hoc data structure with lookup
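To give a feel for the probabilistic end of this list, here is a small naive Bayes classifier, which is about the simplest Bayesian network structure there is. It predicts the likely recipient from keywords on the page being shared. The class name, the keyword features, and the sample data are all hypothetical, and this is only one of many ways such a predictor could be built.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesRecipient:
    """Naive Bayes over page keywords, with add-one (Laplace) smoothing."""

    def __init__(self):
        self.shares = Counter()            # contact -> number of shares
        self.words = defaultdict(Counter)  # contact -> keyword counts
        self.vocabulary = set()

    def observe(self, contact, keywords):
        self.shares[contact] += 1
        self.words[contact].update(keywords)
        self.vocabulary.update(keywords)

    def predict(self, keywords):
        total = sum(self.shares.values())
        best, best_score = None, -math.inf
        for contact, n in self.shares.items():
            # log P(contact) + sum of log P(keyword | contact)
            score = math.log(n / total)
            # The extra +1 reserves probability mass for unseen keywords.
            denom = sum(self.words[contact].values()) + len(self.vocabulary) + 1
            for word in keywords:
                score += math.log((self.words[contact][word] + 1) / denom)
            if score > best_score:
                best, best_score = contact, score
        return best


model = NaiveBayesRecipient()
model.observe("Mary", ["recipe", "baking"])
model.observe("Mary", ["recipe", "dinner"])
model.observe("Bob", ["java", "release"])
```

Sharing a page tagged "recipe" would propose Mary, while a page tagged "java" would propose Bob even though Mary has been shared with more often overall.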
Predictive interfaces are not a new idea. A lot of research has gone into their various types and technologies. Amazingly, in popular computing systems these are nowhere to be found.
Interestingly, games are at the forefront of this capability. To provide the best game play, creators have had to use applied Artificial Intelligence techniques and actually make them work, rather than leave them as fodder for academic discussions.
Even Microsoft has had a predictive computing initiative, “Decision Theory & Adaptive Systems Group”, and had efforts like the Lumiere project. Has anything made it into Windows? Maybe the ordering of a menu changed based on frequency.
I came up with this idea while using my Samsung Galaxy Note smartphone or “phablet”. Using the same phone I brainstormed the idea. Here is one of the diagrams created using the stylus:
There are research and commercial efforts to create and sometimes monetize a Proactive User Interface.
- Quick Access in Drive: Using Machine Learning to Save You Time, https://research.googleblog.com/2017/03/quick-access-in-drive-using-machine.html
- “The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users“, http://research.microsoft.com/en-us/um/People/horvitz/lumiere.htm
- “Web based evaluation of proactive user interfaces“, http://atlas.tk.informatik.tu-darmstadt.de/Publications/2008/SchreiberEtAl2008.pdf
- “Introduction to Behavior Trees“, http://www.altdevblogaday.com/2011/02/24/introduction-to-behavior-trees/
- “A Comparison between Decision Trees and Markov Models to Support Proactive Interfaces“; Joan De Boeck, Kristof Verpoorten, Kris Luyten, Karin Coninx; Hasselt University, Expertise centre for Digital Media, and transnationale Universiteit Limburg, Wetenschapspark 2, B-3590 Diepenbeek, Belgium; https://lirias.kuleuven.be/bitstream/123456789/339818/1/2007+A+Comparison+between+Decision+Trees+and+
- “Understanding the Second-Generation of Behavior Trees – Game AI“, http://aigamedev.com/insider/tutorial/second-generation-bt/, accessed 10/16/2012.
- “On-line Case-Based Planning“, http://www.cc.gatech.edu/faculty/ashwin/papers/er-09-08.pdf, Santi Ontañón, Kinshuk Mishra, Neha Sugandh, and Ashwin Ram, CCL, Cognitive Computing Lab, Georgia Institute of Technology, Atlanta, GA 30322/0280, USA
- Blackboard system, http://en.wikipedia.org/wiki/Blackboard_system
- “An introduction to Outlook rules“, http://zatz.com/outlookpower/article/an-introduction-to-outlook-rules/
- “blackboardeventprocessor“, http://code.google.com/p/blackboardeventprocessor/wiki/BlackboardConcepts
- “Automatically Generating Personalized Adaptive User Interfaces“, http://www.youtube.com/watch?v=ODrE7SodLPs&playnext=1&list=PL7E1AF55BC556E56B&feature=results_video, Krzysztof Gajos, Stanford University Human Computer Interaction Seminar (CS547).
- “Halo Statecharts“, http://gram.cs.mcgill.ca/statecharts/index.php
- “Decision Making and Knowledge Representation in Halo 3“, http://www.bungie.net/images/Inside/publications/presentations/publicationsdes/engineering/nips07.pdf
- Progressive User Interfaces
- Eight Principles of Natural User Interfaces
- Phone: Samsung Galaxy Note SGH-I717
- Android 4.0.4 Ice Cream Sandwich
- Host: Windows 7 Professional 64bit.
- PC: AMD quad with 8GB ram.
- Brain: Belonging to carbon-based life form, Earth, Homo sapiens sapiens.