Figure: Standardized OOM representation ("fingerprint") of a stochastic system.
Subscribe to the OMK-announce mailing list and stay informed on OMK.
Observable Operator Models (OOMs) are generative systems that can model stochastic time-series data and sequences. In this regard OOMs are comparable to Hidden Markov Models (HMMs), but OOMs are more expressive. HMMs are currently widely used, e.g., in biological sequence modeling, engineering, and speech processing, where one wishes to model stochastic systems that have memory or other context effects.
OOMs, although superficially similar to HMMs, spring from a very different mathematical idea. Stochastic time series are usually modeled mathematically as a trajectory in some state space, where observations correspond to locations in that space. OOMs instead conceive stochastic trajectories as a sequence of operations, i.e., observations correspond one-to-one to mathematical actions. Hence the name, "observable operator models". It turns out that every stochastic system can be modeled with linear observable operators acting in an abstract vector space (not necessarily finite-dimensional). This linearity results in a transparent mathematical theory of stochastic systems, which brings the tools from linear algebra to the world of stochastic processes.
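To make the "observations as operators" idea concrete, here is a minimal sketch in Python. It is not OMK code and does not use the OMK API. It assumes the standard OOM formulation in which the probability of a sequence is obtained by applying one matrix per observed symbol to a starting vector and then summing the components of the result. The operator values, the starting vector, and the names `OPERATORS`, `W0`, `apply_operator` and `sequence_probability` are all made up for illustration.

```python
from itertools import product

# Hypothetical 2-dimensional OOM over the alphabet {"a", "b"}.
# The numbers are invented for illustration; they are not taken from the OMK.
# Each column of tau_a + tau_b sums to 1, and W0 is left unchanged by that sum,
# so the probabilities of all sequences of a given length add up to 1.
OPERATORS = {
    "a": [[0.5, 0.1],
          [0.2, 0.1]],
    "b": [[0.1, 0.3],
          [0.2, 0.5]],
}
W0 = [0.5, 0.5]  # starting vector, components sum to 1


def apply_operator(matrix, vector):
    """Return the matrix-vector product as a new list."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sequence_probability(sequence, operators=OPERATORS, w0=W0):
    """Compute P(a1 ... an) = 1 * tau_an * ... * tau_a1 * w0.

    Each observed symbol selects one linear operator; the operators are
    applied to the state vector in the order the symbols were observed.
    """
    state = list(w0)
    for symbol in sequence:
        state = apply_operator(operators[symbol], state)
    # Multiplying by the all-ones row vector is the same as summing.
    return sum(state)


if __name__ == "__main__":
    print(sequence_probability("a"))
    print(sequence_probability("ab"))
    # Probabilities of all length-3 sequences should add up to (about) 1.
    print(sum(sequence_probability(s) for s in product("ab", repeat=3)))
```

Note that no hidden state is sampled or tracked here: the whole computation is a chain of matrix-vector products, which is what makes linear algebra applicable.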
The Observable Operator Modeling Kit (OMK) brings OOMs to the real world. It includes a powerful learning algorithm for estimating OOMs from training data. The OMK was originally developed at the Fraunhofer Institute for Autonomous Intelligent Systems (Fraunhofer AIS) and generously donated to the open-source community.
The OMK was designed and developed by me, Tobias Oberstein, as part of my master's thesis from 10/2001 to 12/2002. Additional development on important pieces of the learning algorithm ("Phase 2", see the OMK learning algorithm) was done by Dr. Klaus Kretzschmar. The OMK currently consists of more than 20,000 lines of C++ code, written in a mixture of object-oriented and generic programming style for good domain abstraction while retaining maximum efficiency.
The model class of OOMs is richer: there are OOMs (of finite dimension) that can't be mimicked by HMMs (with finitely many hidden states). OOMs form a deep and nice connection between linear algebra (LA) and stochastic processes. This allows one to apply the tools of LA to the theory and practice of stochastic processes and gives many nice insights. There are justified hopes that OOMs can be trained more effectively than HMMs are trained today.
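One half of the "richer" claim can be illustrated directly: any HMM with finitely many hidden states can be rewritten as an OOM of the same dimension. The sketch below is not OMK code. It assumes the usual construction in which the operator for symbol a combines the HMM transition probabilities with the probability of emitting a from the current state, and it checks the result against a brute-force sum over all hidden state paths. The example HMM and all names (`hmm_to_oom`, `oom_probability`, `hmm_probability_bruteforce`, `TRANSITION`, `EMISSION`, `INITIAL`) are hypothetical.

```python
from itertools import product

# Hypothetical 2-state HMM over the alphabet {"a", "b"} (made-up numbers).
TRANSITION = [[0.9, 0.1],   # TRANSITION[i][j] = P(next state j | state i)
              [0.3, 0.7]]
EMISSION = [{"a": 0.8, "b": 0.2},   # EMISSION[i][a] = P(symbol a | state i)
            {"a": 0.1, "b": 0.9}]
INITIAL = [0.6, 0.4]                # initial hidden state distribution


def hmm_to_oom(transition, emission):
    """Build one operator per symbol: tau_a[j][i] = transition[i][j] * emission[i][a]."""
    n = len(transition)
    return {
        a: [[transition[i][j] * emission[i][a] for i in range(n)]
            for j in range(n)]
        for a in emission[0]
    }


def oom_probability(sequence, operators, w0):
    """Apply the operators in observation order, then sum the components."""
    state = list(w0)
    for symbol in sequence:
        state = [sum(m * v for m, v in zip(row, state))
                 for row in operators[symbol]]
    return sum(state)


def hmm_probability_bruteforce(sequence, transition, emission, initial):
    """Sum the joint probability over every hidden path (non-empty sequence)."""
    n = len(transition)
    total = 0.0
    for path in product(range(n), repeat=len(sequence)):
        p = initial[path[0]]
        for t, (s, a) in enumerate(zip(path, sequence)):
            p *= emission[s][a]
            if t + 1 < len(path):
                p *= transition[s][path[t + 1]]
        total += p
    return total


if __name__ == "__main__":
    ops = hmm_to_oom(TRANSITION, EMISSION)
    for seq in ("a", "ab", "bba"):
        print(seq, oom_probability(seq, ops, INITIAL),
              hmm_probability_bruteforce(seq, TRANSITION, EMISSION, INITIAL))
```

The two columns printed for each sequence should agree up to rounding. The converse direction, OOMs that no finite HMM can reproduce, is not demonstrated by this sketch.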
HMMs have "structure" (hidden states + emission distributions) which may "fit" an application domain. That is, HMMs, like OOMs, define probability distributions, but in certain situations you may additionally be able to interpret the hidden states in terms of application domain concepts. It really boils down to the question of whether your application requires hidden states that you are able to interpret. If you're "only" interested in learning probability distributions from data, then that is irrelevant. HMMs are also widely known and well established both in the theoretical communities and in application domains.
The OMK itself is
OSI Certified Open Source Software, generously donated by the
Fraunhofer Institute for Autonomous Intelligent Systems (Fraunhofer AIS)
to the open-source community. The OMK is licensed under the
BSD-license.
Thus, it can be incorporated in a wide range of software, ranging
from GPL licensed software (the BSD-license without the so-called
advertising clause is GPL-compatible) to undisclosed, commercial
software without requiring any royalties. Though not required by
the BSD-license, we would welcome it if you donated fixes and improvements
to the OMK itself back to the community.
Important: the OMK currently depends on libraries
that are not free for commercial use.
The OMK depends on several external libraries, including LAPACK, BLAS, PORT and Expokit:
LAPACK and BLAS are widely available as free Fortran source and from
hardware vendors (e.g.
Intel MKL) in optimized, but compatible versions.
The PORT library as a complete package is available under a non-exclusive,
non-commercial, limited-use source license at no cost from
Lucent Technologies. OMK uses only
the non-linear optimization features of the PORT library. As can be
read here,
some modules of the PORT library are in the public domain. We did not
have time to check whether the modules required by OMK fall into the
public domain.
The Expokit package is free for non-commercial purposes, but approval must
be sought for commercial purposes.
The OMK will be released in the near future as a complete package suitable for end users, with binaries and samples included. Until then, you may have a look at the OMK project page at SourceForge, where you will find the OMK source code in the project's CVS repository.
(c) 2003, The OMK Project. All rights reserved.
$Id: index.html,v 1.9 2003/08/27 12:57:51 toberstein Exp $