WebVM whitepaper

Unlocking the power of the mobile Web

Introduction

WebVM is an initiative by Aplix to improve the mobile web. This whitepaper sets out to explain why we think WebVM is needed, what problems it solves, and how it works.

The growing adoption of 3G broadband cellular services, and the recent revolution in the power of web-based applications on the desktop, are combining to reinstate belief in the mobile web and stimulate the creation of new mobile-oriented services. In addition, the hugely impressive iPhone has set a new standard for interactivity and functionality in mobile web applications.

In parallel, however, there is a growing recognition that mobile applications are not just about interactivity. As was evident years before in the context of mobile Java environments, mobile apps often depend on the ability to access services or data associated with the phone platform - be it access to information about the phone’s location, or access to personal or other context information, or use of the phone’s built-in messaging services - and application environments need to provide access to these services. This represents a problem because there is no standardised mechanism for a web application (i.e. an app based on html+javascript, or svg+javascript) to gain access to these services.

There have been experimental approaches to addressing this need by several companies but none has emerged as a serious candidate for standardisation. In addition to having limited adoption, these existing efforts universally fail to deal adequately with all of the use cases, and with the attendant security requirements, that are foreseen. There is therefore continuing fragmentation while piecemeal and proprietary solutions are coming into the marketplace.

WebVM is a new approach to the problem. It is based on some unique insights into the requirements and possible solutions, coupled with significant enabling technology. In this paper we explain in detail the needs that WebVM aims to satisfy, how it works, how it will be productised for the market, and how we plan to support developers in the wider content ecosystem.

The need

Web-based applications and services are assuming a central role in the commercial and technology landscapes. Extending these to mobile is the next frontier of opportunity

Everyone is talking about the unparalleled power of the web and its ability to connect individuals, form communities, and mediate between service providers and their customers. This power is changing the technology landscape for nearly everyone who has a PC.

There is naturally a desire to have that same power extended to mobile. After all, a mobile is an inherently connected device and there are already more phone subscribers than there are internet-connected PC users. In theory, there is huge potential in the extension of web technologies and services to mobile.

The reality falls a long way short of that promise right now. Even in the most advanced markets (e.g. Japan) the mobile web is a pale shadow of the mainstream web. In other markets the situation is worse; the vast majority of sites and services make no concession to mobile clients, and are unusable or do not work at all. User experience is poor; accessibility problems - already significant - are magnified; content and devices do not interoperate or perform well; and in certain respects the open ecosystems of the regular web are simply not there. These basic problems began several years ago with the first introduction of WAP, and although the industry is working to solve them, progress is painfully slow.

Web-based implementations are set to displace resident applications for a wide range of applications and services

Now, however, there is an attitude shift towards the role of the mainstream web, with the widely held view that it will soon become the dominant way to deliver services to the desktop. The major technology players (Google, Adobe and Microsoft) have all made very significant announcements recently about technologies to support “Rich Internet Applications”. The web is changing from being one medium for service delivery to the medium for service delivery. Corresponding to this, there is an attitude shift towards the role of the mobile web. Soon, as with all desktop technologies, it will come to mobile in a big way. There is still much work to do, but it is widely believed eventually to be inevitable.

As this happens, web applications will increasingly displace resident (Java or native) applications. There are compelling advantages that make web applications so powerful. Some developers claim that they can be much more productive with web development technologies - especially if they are adapting an existing website aimed at the desktop. For some applications, the power comes from the ability to hook up with existing services already on the web - look at all of the new applications that have been developed, for example, based on the Google maps API. Many believe that web apps scale better, because you can access a thousand different websites, but there would be no way you could install a thousand different clients onto your phone.

The killer advantage of web applications is that every single day you can improve the user experience

The issue that service providers cite most often, however, is this: every single day you can improve the user experience. The user does not have to install a new client to get the benefit - it’s just there the next time he visits the site. This is a huge deal for users, service providers and developers and is one of the practical issues that is fuelling the proliferation of web-based services and applications on the desktop.

In comparison with native and Java applications, however, the mobile web has its shortcomings. To make web-based services possible or useful on mobiles, there are several broad needs that are not satisfied by current technology or standards. Each of these needs is examined in this section.

The need for device APIs

Javascript APIs for platform services are needed by many web apps to take advantage of the mobile context

The very connection that potentially has the most value - between the web application and the features of the phone handset such as location or the user’s PIM data - is missing. Java APIs to access these platform features have existed for some time, but right now there is no standardised way for a web-based application to get access to those services. This is the biggest single barrier to the development of the kinds of services that the mobile web has the potential to deliver.

The connection a web developer would want would be a collection of javascript APIs to device services. Right now, the picture is confused.

There is no consensus as to how these APIs should be provided

First, there are multiple ongoing initiatives. Looking at standards and industry bodies, there is an effort by the W3C, some (early exploratory) work within the OMTP, and discussions in the OpenAjax Alliance. In parallel, there are actual deployments by device and browser manufacturers of proprietary APIs.

Existing initiatives are providing APIs that are inflexible and only able to address installed widgets, not conventional web applications

All of the existing efforts by manufacturers are native implementations, within the browser, of APIs that are then frozen when the device ships. Freezing APIs in this way is limiting at any time, but especially limiting at this early stage of specifying and implementing these Javascript device APIs.

A further limitation is that, for security reasons, these APIs are often only made available to “web applications” (or “widgets”) that are persistently installed in the phone. This undermines the usefulness of the device APIs - bearing in mind that one of the key attractions of web technology is the ability to upgrade and improve services continuously, without requiring the user to reinstall an application.

It is worth also looking at what Android is doing. Android does not itself specify any device APIs - it provides access to device features indirectly by providing a language-level mechanism, whereby a Java programmer can specify and implement (in Java) any API, but any object implementing that API can be attached (as a javascript global) to a webview window. However, this mechanism is not accessible to web applications, and is only available in WebViews that are instanced within the context of a resident (Java) application.

The security need

Security frameworks are needed to govern access to device APIs to prevent viruses or other malware

Disparity in approaches and inconsistent APIs are one problem. However, there is an even bigger hole: security. Without effective security controls, these device APIs have the potential to be the enablers for the most virulent and intrusive viruses yet seen. None of the approaches discussed above has made any genuine headway in solving the security problem. Addressing this in an effective but usable way is critical to the viability of any solution to the platform API requirement.

The security requirements are complex, and a framework needs to be able to deal with a wide range of competing requirements, including:

The fragmentation need

Device API fragmentation is already occurring in the marketplace

A further issue in the creation of device APIs is fragmentation. We have already discussed how there are several parallel initiatives to define javascript APIs for location, or for access to PIM data. There will soon be devices available on the market supporting each of these different APIs.

It is tempting to think that the solution to this fragmentation issue is for a standards body to create standardised APIs that all devices and browsers can implement. Is this the right approach? In fact our belief is that this is exactly the wrong approach. There are really two problems with it:

JavaME shows us that proactive and centralised standardisation by an industry body will not avert fragmentation

The historical development of J2ME exemplifies both of these issues. Ironically, the JCP standardisation efforts themselves became the cause of much fragmentation. Due to cost and the risk of delay, manufacturers and operators do not incorporate new APIs unless there is a clear commercial need; and when they do, they all make different choices as to which capabilities to include. Prior to standard APIs being available, operators and manufacturers still create private APIs, which must then be supported on an ongoing basis and contain a slightly different feature set from the standard so content cannot easily migrate to use a new API.

Decentralised API definition can be a better way to proceed, provided mechanisms exist to manage dissimilarity and evolution of APIs

Although standardisation of APIs is beneficial, it is also important to allow APIs to be defined and implemented independently, for example by content developers, service providers, manufacturers or network operators. At first sight this would appear to create fragmentation, but it does not, provided that APIs are deployable (meaning that they can be downloaded to a device after manufacture) and modular (meaning there is also a way of writing and deploying libraries or wrappers that can implement one API on top of another). If these two conditions are met, it is not necessary to wait for a centralised definition of any given API because developers can adapt to different device features and APIs without creating an explosion in the number of platform variants that need to be maintained.
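As an illustration of the "modular" condition, the following is a minimal sketch of a wrapper library that implements one API on top of another. Both interfaces (VendorLocation and PositionApi) and the string format are hypothetical, invented for this example; they are not WebVM or JSR definitions.

```java
import java.util.Locale;

/** Hypothetical sketch: a wrapper library implementing one location API on top of another. */
final class ApiWrapperSketch {

    /** API "A" (hypothetical): what an already-deployed library might expose. */
    interface VendorLocation {
        double[] latLon(); // {latitude, longitude} in decimal degrees
    }

    /** API "B" (hypothetical): what the content developer codes against. */
    interface PositionApi {
        String currentPosition(); // "lat,lon" with four decimal places
    }

    /** The wrapper: implements API B purely in terms of API A. */
    static final class PositionOverVendor implements PositionApi {
        private final VendorLocation vendor;

        PositionOverVendor(VendorLocation vendor) {
            this.vendor = vendor;
        }

        @Override
        public String currentPosition() {
            double[] p = vendor.latLon();
            // Locale.ROOT keeps the decimal separator a '.' regardless of device locale.
            return String.format(Locale.ROOT, "%.4f,%.4f", p[0], p[1]);
        }
    }

    public static void main(String[] args) {
        // A fixed position stands in for a real device service.
        VendorLocation fake = () -> new double[] {35.5, 139.25};
        PositionApi api = new PositionOverVendor(fake);
        System.out.println(api.currentPosition());
    }
}
```

Content written against PositionApi would then run unchanged on any device for which such a wrapper can be deployed, which is the property the argument above relies on.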

Summary

WebVM is only able to fulfil the needs of the market if it can simultaneously satisfy the needs of service integrators, users and the wider ecosystem

So, to summarise the motivations for WebVM:

but at the same time:

This paper explains how WebVM addresses these needs.

How WebVM works

WebVM plugin

The central piece of WebVM technology is a browser extension that bridges between javascript and the phone’s Java runtime

What WebVM provides is a connection between the web application environment (ie the browser, usually) and the Java runtime that is present on nearly every handset today. What it allows the web developer to do is deploy a Java library along with the web application - so the code in the web application can make calls to the Java library. The real power of this comes from the fact that the Java library can access platform features, through a wide range of existing standardised Java platform APIs defined as profiles (commonly known as JSRs). There is a multitude of profiles already deployed on most phones that allow a Java library to access:

… and many more.

WebVM integrates with a browser just like any other plugin

The concept behind WebVM is really quite simple. It hooks into the browser (or other environment - it’s not limited to the browser) in the same way that a traditional browser plugin works.

A web page invokes WebVM by referencing an asset of a particular type - in this case a Java library we’ll call a WebVM control - and the browser invokes a specific registered handler for that content type (in the same way as it would for a particular image or media type). In our case the registered handler is the WebVM plugin. This instances a Java VM, which loads the library referenced by the web page.

Since modern browsers allow plugins to be scripted just like any other element of a web page, WebVM is able to act as a bridge between the JavaScript and Java languages

The WebVM plugin responds when scripts in the web page make function calls on it: it forwards the function calls to the Java VM, which in turn makes corresponding calls into the Java library. In effect, therefore, the WebVM plugin allows the API exported by the Java library to be made visible to the web app programmer.

Obviously, the WebVM plugin takes care of all of the details associated with mapping language types between JavaScript and Java, handling language exceptions and other errors when they occur, and so on. It also handles the (extremely important) issue of mapping the security context of the web app to a security context for the Java VM, so arbitrary web pages do not get to make unauthorised calls to sensitive Java APIs.
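The forwarding and type mapping described above can be pictured with a small reflection-based sketch. This is not the WebVM implementation: the class names, the ScriptError type and the conversion rules (script numbers arriving as doubles) are assumptions made for illustration, and the mapping of security contexts is deliberately left out.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical sketch of a script-to-Java call bridge; not the WebVM implementation. */
final class BridgeSketch {

    /** Error reported back to the calling script (hypothetical type). */
    static final class ScriptError extends RuntimeException {
        ScriptError(String message) { super(message); }
    }

    /** Stand-in for a Java library deployed with a web application. */
    public static final class DemoLibrary {
        public int add(int a, int b) { return a + b; }
        public String greet(String name) { return "hello " + name; }
        public int fail() { throw new IllegalStateException("device busy"); }
    }

    /** Forwards a script call (method name plus script-typed arguments) to the library. */
    static Object call(Object library, String name, Object... scriptArgs) {
        for (Method m : library.getClass().getMethods()) {
            boolean candidate = m.getDeclaringClass() != Object.class
                    && m.getName().equals(name)
                    && m.getParameterCount() == scriptArgs.length;
            if (!candidate) continue;
            Class<?>[] types = m.getParameterTypes();
            Object[] javaArgs = new Object[types.length];
            for (int i = 0; i < types.length; i++) {
                javaArgs[i] = toJava(scriptArgs[i], types[i]);
            }
            try {
                return toScript(m.invoke(library, javaArgs));
            } catch (InvocationTargetException e) {
                // The library threw: surface its message as a script-level error.
                throw new ScriptError(String.valueOf(e.getCause().getMessage()));
            } catch (IllegalAccessException e) {
                throw new ScriptError("not callable: " + name);
            }
        }
        throw new ScriptError("no such method: " + name);
    }

    /** Maps a script value onto the Java parameter type, or rejects it. */
    private static Object toJava(Object v, Class<?> target) {
        if (v instanceof Number && target == int.class) return ((Number) v).intValue();
        if (v instanceof Number && target == double.class) return ((Number) v).doubleValue();
        if (v instanceof String && target == String.class) return v;
        if (v instanceof Boolean && target == boolean.class) return v;
        throw new ScriptError("cannot convert argument to " + target.getSimpleName());
    }

    /** Script numbers are assumed to be doubles, so Java numbers are widened on return. */
    private static Object toScript(Object result) {
        if (result instanceof Number) return Double.valueOf(((Number) result).doubleValue());
        return result;
    }

    public static void main(String[] args) {
        Object lib = new DemoLibrary();
        System.out.println(call(lib, "add", 2.0, 3.0));
        System.out.println(call(lib, "greet", "web"));
    }
}
```

A real bridge would also have to check the caller's security context before dispatching; the sketch only covers the call forwarding, type mapping and exception handling.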

WebVM security system

The viability of WebVM depends on being able to define a security policy that makes it possible to expose device APIs whilst controlling exposure to the risk of compromise. Without this, device API access would allow hostile sites to compromise user data, expose the user to additional cost, propagate viruses, or steal personal data.

The security requirements of WebVM mean that its security infrastructure needs to be richer than the familiar models from JavaME or the smart phone operating systems

Of course, security models already exist for Java ME and for other environments that permit applications to be installed on a phone - for example the “Symbian Signed” system for Symbian and the “Mobile2Market” system for Windows Mobile. However, web applications present a new set of requirements and it is not as simple as replicating these security models for WebVM.

There are also security systems built into current desktop browsers; but these also have issues when examined in the context of the functionality offered by WebVM and the unique requirements for mobile; they are inflexible (for example, being unable to control permissions at an appropriate level of granularity), differ substantially between the browser frameworks, and are poorly understood by users. What is required is a new approach to the security policy.

In defining this security framework, a number of issues have to be addressed.

JavaME security (as defined in MIDP2) deliberately simplifies the problem but this means it cannot scale to become a trust framework for the internet

First, there is the issue that MIDP2 security (and certain other similar frameworks) confuses authenticity and trust. Usually, a cryptographic signature (e.g. on a JAR file) is used to verify the authenticity of the signed entity, i.e. that it genuinely was authored by the party that is advertised as its author. Once the identity is verified, a policy can be applied to assign a level of trust, and a set of associated permissions, to that entity.

However, in MIDP2, the assigned trust and permissions are not based on the verified identity - instead, they are based on the owner of the root certificate that was used in the signature verification. This system prevents any truly scalable security policy being defined and implemented.

The WebVM security model must be able to deal with multiple different systems of identity, and multiple realms of security-relevant actions

The next issue is that there are complex identities - for example an object embedded in a web page has its own identity, but its security context also includes the identity of its containing page (i.e. its “referrer”). Also, there are multiple identity systems involved - including the Distinguished Name of a code signer and the domain of a web page.

Next there is the issue of having multiple namespaces of security-relevant actions. There are the actions that Java code might attempt that correspond to MIDP2 permissions, but also security-relevant actions in the browser (eg opening a popup, executing Javascript, or storing a cookie). Finally, there are actions that are part of the operation of WebVM itself that are security-relevant - such as the act of binding a web page to a given WebVM library.

Each of these security-relevant actions corresponds to a property that can be controlled by a security policy - ie a permission. These multiple permissions namespaces must all be addressed by the framework.

A web application, or parts of it, can validly be run remotely or installed locally, and so multiple permutations must be handled by the security model

Finally, the security framework must take into account the fact that there are multiple provisioning systems in place, and it must be possible to establish effective and workable policies for locally installed web applications (or “widgets”) but also for web sites loaded in the browser in the usual way.

High-level security models

There are two ways that the security system can assign trust to the different working parts of a web application

WebVM allows code running in a web page (whether local or remote) to call a Java library and, ultimately, to perform actions on the underlying platform via Java platform APIs. The WebVM security framework has two different ways of treating those actions of the library in relation to the identity of the library and the identity of the containing page.

The “pass-through” model treats the Java library as a transparent extension of the web page from a trust perspective

The first, and simplest, model simply treats the WebVM library as a transparent extension of the containing page. All security-relevant actions attempted by the Java library are regarded by the framework as having been attempted directly by script running in the containing page. Therefore, it is the script’s identity (which, under the browser security model equates to the page’s identity or origin) that is used in deciding whether or not to permit the action.

The “trusted subsystem” model places specific trust in the Java library to isolate the web page from certain sensitive functionality

The second model aims to deal with the situation in which the WebVM library uses certain low level primitives to make a specific set of higher-level services available to calling scripts. The library is trusted (to some degree) not to abuse the primitive operations it has access to, and therefore the full generality of those lower level primitives cannot be exploited by scripts. For example, the library might use SMS to interact with a specific service - it only sends to a specific known address and can be trusted not to spam the SMS inboxes of other addressees in the user’s phonebook. The trusted library then exposes a different service to the enclosing app - say an SMS voting service. The library becomes a trusted subsystem, shielding the platform from abuse by web apps that make use of it. This is the trusted subsystem model.

With the trusted subsystem model, the WebVM library and the containing page are considered as having separate identities. The identity of the library is the relevant identity when determining whether or not to permit security-relevant actions at the MIDP level; the identity of the containing page is then relevant to:

The trusted subsystem applies in situations when the Java library is signed and verified and installed on the phone

In order for the trusted subsystem model to apply, the library itself must be signed and verified, and the library must be installed locally on the device.
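The choice between the two models can be sketched as a small decision function: which identity governs a platform-level action. The types, the policy shape (a map from identity to permitted actions) and the names such as "sms.send" are hypothetical and only illustrate the rules stated above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of pass-through versus trusted-subsystem identity selection. */
final class TrustModelSketch {

    /** A WebVM library as seen by the policy (hypothetical fields). */
    static final class Library {
        final String signer;      // verified signer identity, or null if unsigned or unverified
        final boolean installed;  // true if the library is installed locally on the device

        Library(String signer, boolean installed) {
            this.signer = signer;
            this.installed = installed;
        }
    }

    /** The trusted subsystem model applies only to signed, verified, locally installed libraries. */
    static boolean isTrustedSubsystem(Library lib) {
        return lib.signer != null && lib.installed;
    }

    /** Identity whose permissions govern a platform-level (MIDP-level) action. */
    static String governingIdentity(Library lib, String pageOrigin) {
        // Pass-through: the library is a transparent extension of the page.
        return isTrustedSubsystem(lib) ? lib.signer : pageOrigin;
    }

    /** Looks up the permission under whichever identity governs the action. */
    static boolean permits(Map<String, Set<String>> policy, Library lib,
                           String pageOrigin, String permission) {
        Set<String> granted = policy.getOrDefault(
                governingIdentity(lib, pageOrigin), Collections.<String>emptySet());
        return granted.contains(permission);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> policy = new HashMap<>();
        policy.put("CN=Example Voting Ltd", Collections.singleton("sms.send"));

        Library signedInstalled = new Library("CN=Example Voting Ltd", true);
        Library unsigned = new Library(null, false);

        System.out.println(permits(policy, signedInstalled, "http://vote.example", "sms.send"));
        System.out.println(permits(policy, unsigned, "http://vote.example", "sms.send"));
    }
}
```

In the sketch, the same page gains SMS access only through the signed and installed library; with an unsigned library the page's own origin is consulted and, having no grant, is refused.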

Formal security framework

The underlying system framework is very general and highly configurable

The WebVM security framework is formally defined, based on a definition of:

A typical configuration would be based on a series of “trust zones” which are collections of web sites with certain permissions

In a practical configuration, however, the full generality of the framework is not exposed; instead, it would be used to configure a policy along the lines described here.

A series of “trust zones” define a default set of permissions for websites in different categories (such as trusted sites, restricted sites, etc) as with IE. The zones are managed so that any specific site resolves to a unique trust zone.

Certain specific sites may have additional specific permissions granted - these might, for example, be those instances where the default policy for a zone required that the user be prompted, but the user requested that the permission be assigned permanently.
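A zone-based policy of this kind could be modelled roughly as follows. This is a sketch under stated assumptions: the three-valued decision, the fallback zone name "internet", the deny-by-default rule and the permission name are all invented for illustration and are not part of the WebVM definition.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of trust zones with permanent per-site grants. */
final class TrustZoneSketch {

    enum Decision { ALLOW, PROMPT, DENY }

    static final String DEFAULT_ZONE = "internet"; // assumed zone for unlisted sites

    private final Map<String, String> zoneOfSite = new HashMap<>();
    private final Map<String, Map<String, Decision>> zoneDefaults = new HashMap<>();
    private final Map<String, Set<String>> permanentGrants = new HashMap<>();

    /** Each site maps to exactly one zone; a later assignment replaces an earlier one. */
    void assignZone(String site, String zone) {
        zoneOfSite.put(site, zone);
    }

    void setZoneDefault(String zone, String permission, Decision decision) {
        zoneDefaults.computeIfAbsent(zone, z -> new HashMap<>()).put(permission, decision);
    }

    /** Records that the user asked for a prompted permission to be granted permanently. */
    void grantPermanently(String site, String permission) {
        permanentGrants.computeIfAbsent(site, s -> new HashSet<>()).add(permission);
    }

    /** Site-specific grants win; otherwise the zone default applies; otherwise deny. */
    Decision check(String site, String permission) {
        if (permanentGrants.getOrDefault(site, Collections.<String>emptySet()).contains(permission)) {
            return Decision.ALLOW;
        }
        String zone = zoneOfSite.getOrDefault(site, DEFAULT_ZONE);
        Map<String, Decision> defaults =
                zoneDefaults.getOrDefault(zone, Collections.<String, Decision>emptyMap());
        return defaults.getOrDefault(permission, Decision.DENY);
    }

    public static void main(String[] args) {
        TrustZoneSketch policy = new TrustZoneSketch();
        policy.setZoneDefault("trusted", "location.read", Decision.PROMPT);
        policy.assignZone("maps.example", "trusted");

        System.out.println(policy.check("maps.example", "location.read"));
        policy.grantPermanently("maps.example", "location.read");
        System.out.println(policy.check("maps.example", "location.read"));
        System.out.println(policy.check("unknown.example", "location.read"));
    }
}
```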

Installed and trusted WebVM modules would typically be presented to the user as ‘trusted web extensions’

Next, there will be security configuration belonging to those WebVM libraries that operate under the trusted subsystem model. These might be presented to the user as “trusted web extensions” that individual sites can make use of with the user’s permission. These would have their own configured policy (indicating which low-level operations they are permitted to perform), and are able to present their own set of higher-level security-relevant actions to invoking pages. These actions themselves can be the subject of a security policy, governing which web sites are permitted to invoke those operations, and whether or not interactive confirmation is required.

WebVM module system

WebVM includes a module system that takes care of the loading of the APIs required by any given web application

In practice, a web application developer will not want to develop a Java library each time a new web page wants to access the location data (say); and a user will not want to install that library each time he visits a new site that uses it. (That would take away from one of the main aims of web apps, which is to avoid the need for the user to perform this kind of installation.) So, WebVM expects in fact that a series of JavaScript APIs will be developed, and each of them only needs to be deployed to the device once, no matter how many different web sites make use of them. So WebVM is built so that it supports the deployment of JS APIs (which are a combination of a Java library, and a JS wrapper that instances the library), and the management and coexistence of those APIs.

Of course, within a browser it is already possible to reference a JS API simply by referring to it in a script tag. However, referencing many JS modules this way can be problematic when there are complex apps that could give rise to multiple inclusion, conflicts in the global namespace, load-order dependencies between modules, and contention for certain events (such as the onload event).

The community is beginning to address this problem by defining frameworks to allow modules to be developed independently and then combined within an application without conflict. WebVM includes such a framework, which additionally takes care of the installation of Java libraries, and persistent storage of permissions associated with those libraries.
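Two of the problems a module framework has to solve, multiple inclusion and load-order dependencies, can be shown with a short sketch of a loader that loads each module once and loads dependencies first. The class, its methods and the module names are hypothetical; the installation of Java libraries and the storage of permissions are not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a module loader: load once, dependencies first. */
final class ModuleLoaderSketch {

    private final Map<String, List<String>> dependencies = new HashMap<>();
    private final Set<String> loaded = new HashSet<>();

    /** Declares a module and the modules it depends on. */
    void register(String module, String... deps) {
        dependencies.put(module, Arrays.asList(deps));
    }

    /** Loads a module and its dependencies; returns the modules newly loaded, in load order. */
    List<String> require(String module) {
        List<String> newlyLoaded = new ArrayList<>();
        load(module, new ArrayDeque<String>(), newlyLoaded);
        return newlyLoaded;
    }

    private void load(String module, Deque<String> inProgress, List<String> newlyLoaded) {
        if (loaded.contains(module)) {
            return; // already present, so no second inclusion
        }
        if (!dependencies.containsKey(module)) {
            throw new IllegalArgumentException("unknown module: " + module);
        }
        if (inProgress.contains(module)) {
            throw new IllegalStateException("dependency cycle at: " + module);
        }
        inProgress.push(module);
        for (String dep : dependencies.get(module)) {
            load(dep, inProgress, newlyLoaded); // dependencies are loaded before the module
        }
        inProgress.pop();
        loaded.add(module);
        newlyLoaded.add(module);
    }

    public static void main(String[] args) {
        ModuleLoaderSketch loader = new ModuleLoaderSketch();
        loader.register("events");
        loader.register("location", "events");
        loader.register("maps-app", "location", "events");

        System.out.println(loader.require("maps-app"));
        System.out.println(loader.require("location"));
    }
}
```

In the usage above, the second request for "location" loads nothing new, because it was already brought in as a dependency of "maps-app".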

With this module loader, applications do not even need to know that they are using WebVM - they just see the Javascript APIs they need

With this framework in place, there are further clear benefits to WebVM. The biggest single benefit is that these new APIs (say a location API, or a PIM API) do not need to be embedded in the phone and, more important, do not need to be defined by a single committee as is the case with JSRs in Java. Now, any interested party can define their own APIs to suit their own purposes, so long as the underlying functionality needed is available via a Java profile. These APIs can evolve to meet changing requirements.

Any Javascript API can then be deployed dynamically to the phone - whether it is a standardised API or one defined privately

Of course, it would be great if industry bodies got together to define useful JS APIs - and some are already doing this (like the W3C). However, exploiting WebVM does not have to wait for this to happen. APIs can be created and deployed by publishers, carriers or manufacturers as needed, and superseded by standard APIs once they become available.

WebVM SDK

WebVM includes comprehensive and professional IDE support

Developing for WebVM can be thought of as two essentially separate activities. SDK support is available for each of these, based on the Eclipse ATF (AJAX Tooling Framework) and EclipseME (JavaME) environments.

Most application developers will be able to use WebVM APIs in their IDEs without having to know about the underlying implementation technology

The primary activity is development of web applications (ie websites, or html/javascript applications packaged as widgets for local installation) that use pre-existing javascript APIs. The core WebVM technology is already accompanied by a series of APIs that are deployable, and as WebVM becomes adopted it is expected that further, more specialist APIs will become available.

Developers are then simply developing web applications in the conventional way, using whatever tools they choose. A WebVM plugin exists for the desktop that works with Safari (Webkit), Mozilla and Opera, and this will work with any web development IDE based on any of these browser cores. Of course, many of the WebVM-deployed APIs will only work in a desktop development environment if the PC is connected to a phone to get access to the relevant phone-specific functionality.

WebVM javascript APIs are expected to provide API Metadata in the OpenAjaxAlliance API metadata format, to allow those APIs to be seamlessly supported in IDEs such as the Eclipse ATF.

In order to simplify the creation and debugging of WebVM APIs themselves, an integrated set of Eclipse tools will be provided allowing debugging in both Java and javascript environments

The second kind of WebVM development is the development of APIs to be deployed with WebVM. This involves a combination of javascript and Java development. To support this development, WebVM includes an Eclipse plugin that supports all of the elements of creating and debugging a WebVM API, including:

Deploying WebVM

A JBlend extension

WebVM is usually deployed alongside JBlend, and is ported and integrated in the same way

The WebVM plugin is closely tied to JBlend, Aplix’s JavaME runtime. As such, WebVM is mainly intended to be deployed alongside JBlend, integrated with a phone’s system software and embedded in the phone at manufacture. WebVM represents a very small additional porting and integration requirement in comparison with the main porting and integration activity for JBlend. Like JBlend, WebVM is structured with a porting layer as an independent library from the main plugin code, and therefore it is straightforward to create a small self-contained project to create and deploy the porting layer implementation on any given platform.

Browser integration

Most AJAX-capable browsers will support WebVM without modification

WebVM also requires integration with the browser (or other widget/SVG runtime). WebVM is already available pre-integrated with the primary industry plugin API, the Netscape Plugin API (NPAPI), and is also available as an ActiveX control for use with PocketIE. With many browsers, therefore, no additional integration effort is required.

An SDK is available to support integration with other environments that are not supported already

However, some browsers, and most other target environments, do not support NPAPI. In these cases, it is necessary to implement a porting layer between WebVM and the browser. An SDK, including documentation and header files, is available to support this.

WebVM has been designed to place minimal requirements on the browser or runtime environment; at a minimum, the environment must support scripting of plugins, and must provide hooks that allow plugins to cause javascript to be invoked from the plugin. However, very low-end browsers are likely to be unable to support WebVM. Please contact Aplix to verify that a given browser will be able to support WebVM.

WebVM-JNI

WebVM is available as a standalone plugin for use with JNI-capable VMs such as CDC

A version of WebVM is also available that can work with any conventional JNI-based Java VM, which includes most commercial CDC VMs. This allows for interoperability of WebVM content between CLDC and CDC-enabled phones.

Validation

A new port can be validated using the WebVM test suite

A new port of WebVM can be validated using an extensive set of tests that are available as part of the licensed WebVM porting SDK.