5. Data Processing Infrastructures for Knowledge Filtering: The TLSI Approach
PAKM: First International Conference on Practical Aspects of
Knowledge Management
Basel, Oct 30-31, 1996
Dr. Andreas Goppold
Postf. 2060, 89010 Ulm, Germany
Tel. ++49 +731 921-6931
Fax: (Goppold:) +731 501-999
http://www.noologie.de/symbol06.htm
5.0. Abstract, Keywords
Look-ahead and just-in-case principles: time factor in
professional knowledge work the most critical issue and prime optimization aim,
semi-automating search and retrieval process, minimizing access time at request,
off-line search, local intermediary storage.
Java, Virtual Machine design, script and query language,
search scripting, field-adaptable macro languages, shell and filter approaches
as derived from UNIX filter principles, on-line (real-time) programming,
implementing string- and fuzzy logic for searches, rediscovering ancient script
languages: APL (NIAL), MUMPS, SNOBOL. Optimal match strategies for UIL (User
Interface Language) and EUPL (End User Programming Language), token-type
conversions.
Database tasks: structured local and intermediary storage,
local storage of data sets, structural requirements of tree representation and
manipulation, graphical layout and manipulation of link structures; overview-
and fish-eye techniques for on-line search, abstracting, collating, ordering,
tabulating, printing search results, priority weighting functions graphically
applied, distribution fields, manipulation.
5.1. The problem situation of MUDDL for Knowledge Work
The Internet and the WWW pose the specific problem situation
of information filtering in very large, dynamic and unstructured mazes of data.
With a slightly permuted word order, the relevant key problem terms are
abbreviated into the best-fit acronym MUDDL: Mazes of hyper-linked Unstructured
Dynamic Data that are very Large. Practical use reveals a drastic difference between the many euphoric projections of the potential of the Internet and WWW, e.g. as the "global brain" (Tom Stonier, FIS 96), and the rather sobering experiences made when actually trying to do real knowledge work (KW) and to get tangible, professionally relevant results from consulting the Web. The proverbial problem situations are the famous "lost-in-hyperspace" syndrome, Robert Cailliau's (CERN) "World Wide Spaghetti bowl", and Ted Nelson's "balkanisation" of the WWW.
5.2. Systematics of Knowledge filter problem factors: time, structure, and content
The problem areas are those of unexpected and hard-to-calculate cost factors incurred for results of non-linear value. This is exemplified in the WWW and its typical serendipity effects - a situation closely akin to gold mining: tons of useless data dirt, and every now and then, an information nugget. More systematically, the problem areas of MUDDL can be classed into three main factors: time, structure, and content.
Nowhere is the adage "time is money" more cutting, and more of a truism, than in highly professional knowledge work (abbreviated KW from here on). The professional skill of a knowledge worker (KWr) consists of these main factors:
1) the ability to maintain and interconnect vast networks of fact and process knowledge,
2) the ability to interrelate this to existing information bases (in spite of the computer and information revolution, the most important of these is still the library and its catalogs, then commercial data bases, and then, more hopefully, the Internet),
3) and the ability to apply this personal and data knowledge to problem situations with vastly differing appearances, aspects, and shapes.
The value and relevance of KW stand in inverse relation to the standardization of the problem space: the less standardized and structurable the given problem situation in the field, the more relevant the expertise of the KWr. The value of the KWr's time rises exponentially with his/her ability to analyse the more intricate problem situations. The current WWW and available browser technology are geared to serendipity users - those browsing at leisure, who can tolerate long delays for retrieving marginal chunks of data from the net. The time to retrieve useful information is essentially incalculable, depending on such factors as the nesting depth of hyper-link trees that must be traversed when following a lead, and network bandwidth versus load factors, which vary from location to location and from time to time. This can put KWrs who have the bad luck of residing in an area of backwater net technology and high load into a harshly disadvantaged situation: compared to the U.S. and Scandinavia, the net bandwidth and access delay situation in Germany seems to be dynamically approaching that of a third-world country. In such cases it is intolerable for a professional KWr to conduct real-time on-line searches at all, except in emergencies.
Compared to the cost factor of KWr time, all other information
cost factors, like line-, connect-, and computer costs are negligible, even the
commercial data base connect costs of $10-100/hr are secondary compared to the
$100+/hr (upward unlimited) KWr cost. Compounding the problem is the fact that
no less-skilled labor can be substituted for KWr time. Further compounding it is the global data pollution, which rises exponentially by the day, makes the precious skills of competent KWrs ever more valuable, and taxes all available potential far beyond its limits. (See Illustration TLSI-1).
All subsequent KW problems are but different aspects of the
time problem.
Structure is defined here as a way of pre-arranging content at
the cost of additional storage space and storage time in order to save access
time. Cost tradeoff factors for structure are:
1) Cost for designing and keeping the integrity of the
structure
2) Storage cost for structure base elements
3) Access time cost for structure base elements
4) Update time cost for structure base
5) Maintenance and Redesign cost for structure base when
problem spaces change
6) Data loss cost caused by structure gaps, leading to
irretrievable data elements or exponential access costs.
The general working principle of data base design is to provide methods for matching solution spaces to well-known problem spaces. Specifically, structured database design matches standardized request cases with standardized data fields and retrieval routines. For these cases, such successful methods as relational data bases and SQL were devised, and they have proven their worth. For the more recent case of OO data bases, there exists as yet no equally solid logical foundation for powerful retrieval schemes that match the structural power of the OO database.
While cost factors 2) to 4) are calculable, and dropping with evolving computer technology, the hidden and incalculable cost factors of data base technology are points 5) (and implicitly 1) and 6), since the structure of a large commercial data base in a working organization is next to impossible to change, and there is no way of accounting for data (and profit) losses due to problem 6). The pitfalls of diverse schemes of MIS (Management Information Systems) and CASE are due to this typical problematic of database structure. Largely inevitable is the insidious built-in obsolescence principle of problem 5): the successful application of any ordering principle will, exactly by the success of its application, invariably change the problem space.
MUDDL are based on the serendipity principle: the extreme of totally unstructured data spaces where everything is eventually connected to everything else (sometimes euphemistically called "information" spaces - except that this does not really "inform" anyone very reliably). So while everything in the MUDDL can somehow be accessed eventually, the access time cost rises without limit. A similar problem is well known in the programmer community as the LISP syndrome: a LISP programmer knows the value of everything and the cost of nothing. This is because the infinitely flexible method of LISP programming leads to essentially incalculable processing costs (or time delays, while the system is garbage-collecting).
There is a fundamental tradeoff between linear real-time search access and structured access. Structuring and re-structuring are only useful if the access time saved exceeds the additional expense incurred for building and maintaining the structure. In many cases linear searches are more advantageous than providing structures. The UNIX grep tool is a case in point of how an extremely efficient search-and-scan algorithm can match a wide range of patterns that would otherwise require complicated databases. The legendary power and simplicity of UNIX is attributed to clever exploitation of such innocuous space/time/structure tradeoff principles.
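As a minimal sketch of the linear-search side of this tradeoff - in present-day Python rather than grep itself, with the directory and the pattern chosen purely for illustration - a regular-expression scan over local files needs no index structure at all:

import re
from pathlib import Path

def linear_scan(root, pattern, glob="*.txt"):
    """grep-like scan: yield (file, line number, line) for every match."""
    rx = re.compile(pattern)
    for path in Path(root).rglob(glob):
        for no, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if rx.search(line):
                yield path, no, line

# Usage: search a local mirror directory for a keyword phrase.
for path, no, line in linear_scan("./mirror", r"knowledge\s+filter"):
    print(f"{path}:{no}: {line}")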
The accounting measure for the usefulness of the content of data is the knowledge value of a given datum for the KWr while s/he is working on a specific problem. To coin an aphorism in the vein of Bateson's well-known definition of information, "Information is the difference that makes a difference", the measure of pragmatic information is: "Information is the content that makes the customer content". With regard to the extant knowledge base of the individual KWr and the specific problem at hand, this factor is quite impossible to account for, either qualitatively or quantitatively. What can be accounted for are the data processing factors: data match against request, and availability in time. Typical problems of MUDDL are:
1) the non-existence of standard relations between keyword and content,
2) the time cost of data access due to nested, repeated, and refined searches, which is usually maximal because requests are mostly made in a state of emergency, when time is at a premium.
5.3. Solution strategies
5.3.1. The Global Inverted Data Base
The brute-force approach taken by Altavista (http://altavista.digital.com) is that of the inverted data base. Because storage costs are falling, and 64-bit processor address space is keeping abreast of the current size of the Internet data base, it is possible to build an inverted data base of practically all the words in all the WWW texts on this globe. This is an interim solution at best, since at current rates of growth the number of references to any keyword will soon outstrip storage capacity (or already has). The more practical limit of this approach is the typical rate of 100 to 5000 "garbage" results for every one that is useful. The data access (and scanning/reading) time cost for the user exceeds that of finding the references by several orders of magnitude. Besides this, there is the high cost of additional network load generated by the continuous update processing requests of this global data base.
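In present-day Python terms, the core of such an inverted data base is a minimal sketch like the following (the two sample documents are illustrative assumptions):

from collections import defaultdict

def build_inverted_index(docs):
    """Map every word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

docs = {"page1": "global brain and knowledge work",
        "page2": "knowledge filtering in large data mazes"}
index = build_inverted_index(docs)
print(sorted(index["knowledge"]))   # -> ['page1', 'page2']

The storage problem mentioned above shows up directly: the index grows with every distinct word-document pair, not merely with the number of documents.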
5.3.2. The Structure Base
The approach taken by Hyper-G is that of maintaining data and pointer structures separately. This is a great advantage over the current state of the WWW and allows effective structure navigation and maintenance without prior need to access the data. It alleviates many problems for KW but poses its own difficulties, as practical experience shows: mainly in the structural problem cases 1), 5), and 6) mentioned above, since the user interface handling and updating of the graph data structures of Hyper-G is a problem quite distinct from the purely mathematical description of a graph. The machine requirement for Hyper-G installations is typically a factor of 10 higher than for WWW, and the same holds for the structure design and maintenance time cost. In practical application this sets a limit on the manpower available for maintenance. Even more than in conventional data base design, all the possible uses and applications of the data can in no way be known beforehand, making the solution feasible only for large organizations where procedures and requests are fairly standardized and distributed over a fairly uniform and predictable user community. Here, the high cost of database maintenance can also be justified. Literature: Maurer et al.
5.3.3. Agents, Brokers, Filters
Bridging the gap between the global inverted data base and the supplier-maintained "one-size-fits-all" data structure are the strategies known as Agents, Brokers, and Filters (from now on: ABF). An example definition:
... agent that has access to at least one and potentially many information sources, and is able to collate and manipulate information obtained from these sources in order to answer queries posed by the user and other information agents (Woolridge and Jennings, 1995).
In the current context, ABF will be used as a generic term for a class of solutions involving adaptable and adapting intermediary agents that act as interface elements between data providers and information requestors. It makes no difference here whether the ABF is a person or an organization equipped with certain hardware and software, or whether it is a set of software tools that reside at the location and on the computer of the provider or of the requestor. The terminological distinction takes account of the often-overlooked fact that there is no such thing as information-providing, at least not in the requirement space of KW. The statistical definition of information as given by Shannon and Weaver may be usable for technical signal processing, but it has no application for KW. Information in the KW case is strictly in the eye of the beholder, i.e. it is what informs the KWr. As indicated above, that depends individually on what s/he already knows and on the requirements of the problem at hand. Information defined as sheer novelty has no relevance at all for KW problem-solving situations. The provider cannot do anything other than provide the data, however pre-processed they may be. The various ABF approaches are widely discussed in the literature (CACM).
The general structure of the ABF can be partitioned into four components:
1) Data locator and access machine (finder)
2) Data extract machine (extractor, filter)
3) User profile machine (broker)
4) User profile / Data extract matcher (agent)
The use of the terms broker and agent varies in the literature and is given here only as an illustrative example. Also, 1) and 2) or 3) and 4) are often grouped together as one. The distinction is difficult to make in practice, since the components must cooperate closely to yield useful results, and their interactions cannot be decomposed into simple hierarchical structures.
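As an illustration only - the names and signatures below are assumptions, not an API prescribed by this paper - the four components can be stated as minimal Python interfaces:

from typing import Dict, Iterable, List, Protocol, Tuple

class Finder(Protocol):          # 1) data locator and access machine
    def locate(self, query: str) -> Iterable[str]: ...

class Extractor(Protocol):       # 2) data extract machine (filter)
    def extract(self, source: str) -> str: ...

class Broker(Protocol):          # 3) user profile machine
    def profile(self, user: str) -> Dict[str, float]: ...

class Agent(Protocol):           # 4) user profile / data extract matcher
    def match(self, profile: Dict[str, float], extract: str) -> float: ...

def run_abf(finder: Finder, extractor: Extractor, broker: Broker,
            agent: Agent, user: str, query: str) -> List[Tuple[float, str]]:
    """Rank the extracted data for one user; the components cooperate
    closely but remain separately replaceable."""
    prof = broker.profile(user)
    scored = ((agent.match(prof, extractor.extract(src)), src)
              for src in finder.locate(query))
    return sorted(scored, reverse=True)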
5.4. The TLSI approach to ABF
The discussion in this paper covers the ABF approach taken in the Leibniz TLSI system (literature: Goppold). The TLSI is based on a Virtual Machine (VM) model similar to the SUN Java machine. In the present installation, the crucial parts of the string and data base infrastructure and the UIL / EUPL solution have been implemented, such as the SNOBOL processor, the MUMPS data base engine, some components of the APL processor, and the hypertext and window user interface. (See Illustrations TLSI-2,3,4).
5.4.1. Alan Kay's prediction and Java
The value of computer systems will not be
determined by how well they can be used in the applications they were designed
for, but how well they can be fit to cases that were never thought
of.
(Alan Kay, Scientific American/Spektrum, Oct. 1984)
When Alan Kay made this statement in 1984, no one except perhaps Kay himself could have foreseen the rise of hypermedia computing and the WWW ten years later. In effect, the criterion he stated can be called the hallmark of successful software systems.
SUN's Java development may be cited as a case in point, because it was salvaged from the limbo of an already aborted software development effort. The TLSI application outlined here adds another twist to the unexpected uses the very versatile bytecode VM principle employed by Java can be put to - uses of which its erstwhile creators would have been very hard put to think, and which, had someone suggested them, they would surely have protested most vigorously.
5.4.2. Promethean data processing: the look-ahead and just-in-case principles
The retrieval time delay cost of a MUDDL like the Internet is such that the potential of local and intermediary storage must be utilized to the fullest. This is all the more feasible since disk storage cost is falling, roughly in proportion to the continuous rise in the quantities of data offered. A few gigabytes of local storage cost no more than a few wasted hours of research time of just one single KWr. In the case of an intermediary agent serving several KWrs, it is economically feasible to maintain storage on the order of 100 GB or more. Data are stored according to the Promethean principle, or the look-ahead, just-in-case principle: it is far cheaper to keep a few gigabytes of unused data around than to have to search in vain for a few days in a critical emergency. The venerable name Prometheus is used because this Greek name means, literally translated, "the before-thinker". The currently available caching techniques of WWW browsers are useless in such cases because they are strictly look-behind and allow no selective definitions for caching. Drawing again on mythological accounts, we find the appropriate equivalence in the slow-witted twin brother of Prometheus, Epimetheus, "the after-the-event-thinker", who gained eternal mythological fame as the one who allowed Pandora to open her equally famous box.
The process requirement of the look-ahead, just-in-case principle is a watcher demon process that regularly monitors the data sources for changes and updates the local data base. Since this demon needs to watch only a range of selected targets, at intervals of days or weeks, the processing costs and net load are limited. The frequency of watch access is a function of the maximal possible cost caused by outdated data. Only in a volatile case like the stock market, where conditions change in a matter of hours, is this strategy difficult to apply; in most situations, like travel agency schedules, the watch interval is by the week or the month.
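A minimal sketch of such a watcher demon in Python, assuming a hypothetical watch list, a weekly interval, and a simple file cache (none of which are prescribed by the TLSI):

import hashlib
import time
import urllib.request
from pathlib import Path

WATCH_LIST = ["http://example.org/schedule.html"]   # hypothetical targets
INTERVAL = 7 * 24 * 3600                             # check once a week
CACHE = Path("lookahead-cache")

def watch_once():
    """Fetch every watched target and refresh the local copy if it changed."""
    CACHE.mkdir(exist_ok=True)
    for url in WATCH_LIST:
        data = urllib.request.urlopen(url).read()
        local = CACHE / hashlib.md5(url.encode()).hexdigest()
        if not local.exists() or local.read_bytes() != data:
            local.write_bytes(data)              # just-in-case local storage

if __name__ == "__main__":
    while True:                                  # the demon loop
        watch_once()
        time.sleep(INTERVAL)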
5.4.3. Macro script languages: the missing link between compilers and WIMP interfaces
The fine-tuning of look-ahead ABF search strategies necessitates very powerful, field-adaptable, interactive script, query, and macro languages. These requirements can be satisfied neither by current compiler-based technology nor by WIMP (Window, Icon, Mouse, Pointing) interfaces. Standard compiler technology like C (and compiled Java) programming is unusable for field-adapting search and evaluation script strategies. WIMP is also problematic because of the one-shot character of a WIMP interaction sequence. A script derivation of WIMP interactions is possible, and moderately usable, like the one implemented in Apple's System 7 user interface transaction logging facility. The difficulty in applying this principle to ABF applications lies in the uncontrolled and open nature of the search space, as well as in the problem of token-type conversion and the higher level of abstraction necessitated by the complex ABF case. Other interpreted and interactive systems like LISP and Smalltalk are usable, but need extensions for the complex data cases of ABF processing.
The approach taken by the Leibniz TLSI is a revival of ancient script languages - APL (NIAL), MUMPS, SNOBOL - combined with the UNIX shell script filter and pipe principle. In the earlier days of computing, these script systems were very popular with their user communities because they supplied very powerful processing paradigms with easy-to-use access interfaces for specific data types: APL (NIAL) for numeric and character arrays, MUMPS for string and data base handling, SNOBOL for string processing. The single-character operator command structures of APL and MUMPS were in fact a direct translation of the underlying byte code machine glorified as user interface, which gave these languages their distinctive and elegant mathematical-looking write-only flavor, cherished by their adherents as much as it was abhorred by their detractors. On the upside of the tradeoff balance, these were also the most powerful programming languages ever created, and it was not only possible but easy to write in five lines a data processing solution that would need five pages of contemporary C or Pascal code, object-oriented or not. In the ABF application, the powerful string capabilities of SNOBOL are needed for the complex context-dependent string and text search, while an approach derived from the matrix handling model of APL is to be used for the more general data type of graph traversal and search strategies. While in matrix processing it is of no concern in which order the elements are processed, this is very much a concern in tree traversal. Only the combination of string and graph data model approaches can yield a truly versatile and powerful ABF toolset. These strategies can, of course, be implemented in any suitable interpreter model, be it LISP, Smalltalk, or a variant of BASIC. In any case, the data processing infrastructure, the libraries and data structure machines, must have the minimum processing power, regardless of the syntactic icing applied on top. The approach taken with the TLSI model was chosen for reasons of flexibility: the TLSI is dynamically user-modifiable and imposes no syntactic structure on the solution space. Since the TLSI is a general implementation of a bytecode machine, it can be, and has been, used to implement BASIC, LISP, MUMPS, or APL on top.
The pipe paradigm derived from UNIX filter principles is a necessary ingredient for portioning the ABF processes into small and manageable sub-tasks. In this special case, for example for backtracking, it must at least in principle be possible to preserve the contents of the pipe locally in order to examine them later (see below). This in turn translates into the necessity of a suitable hierarchically nestable data base infrastructure.
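A minimal sketch of this combination: a filter pipeline whose intermediate contents are preserved at every stage so that a later backtracking step can re-examine them. The two toy filters are placeholders for real ABF sub-tasks:

def pipeline(data, stages):
    """Run data through the filter stages, keeping every intermediate result."""
    intermediates = [list(data)]
    for stage in stages:
        intermediates.append([item for item in stage(intermediates[-1])])
    return intermediates[-1], intermediates      # final result plus history

# Usage with two toy filters standing in for real ABF sub-tasks.
drop_empty = lambda lines: (l for l in lines if l.strip())
keep_hits  = lambda lines: (l for l in lines if "leibniz" in l.lower())

result, history = pipeline(["", "Leibniz TLSI", "noise"], [drop_empty, keep_hits])
print(result)        # -> ['Leibniz TLSI']
print(len(history))  # three snapshots: input and two stage outputs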
5.4.4. Database considerations
The power of approaches like APL and MUMPS lies in the Promethean look-ahead principle applied to virtual memory. Those parts of the data that were known to be used most often were kept in RAM on a privileged basis; the system provided all the necessary accounting, and the user did not need to bother with its administration. This was an essential necessity in the days of the 32K to 64K RAM minicomputers on which these systems were initially developed, and it led to solutions optimized and fine-tuned to an extent never attained afterwards, when RAM became cheap and system implementers chose the sloppy approach of assuming that all necessary data are kept in RAM or provided by a virtual memory mechanism. In the case of Java, for example, even though RAM allocation is automatic, the implementation of massive data structures will cause a problem when they cover several MB to GB and grow and shrink systolically in the process. OS virtual memory strategies are always look-behind and often lead to the well-known problem of thrashing, as the system continually swaps in and out the different portions of a large data structure that does not fit into RAM. In such a case the speed of the computation is bounded by disk access time, i.e. slowed by a factor of about 1000. Only a clever management of multi-level pointer access structures (as implemented in MUMPS) is applicable here, and this is exactly the normal case occurring in ABF searches, which must be accommodated.
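A minimal sketch of such a multi-level keyed store follows; Python's shelve module stands in for the real MUMPS pointer machinery, and the storage path and key layout are illustrative assumptions:

import shelve

class GlobalStore:
    """Sparse tree addressed by tuples of keys and backed by disk,
    so that only the touched nodes need to be held in RAM."""
    SEP = "\x01"                                 # key separator, assumed safe

    def __init__(self, path="abf-globals"):
        self.db = shelve.open(path)              # disk-resident backing store

    def set(self, keys, value):
        self.db[self.SEP.join(map(str, keys))] = value

    def get(self, keys, default=None):
        return self.db.get(self.SEP.join(map(str, keys)), default)

g = GlobalStore()
g.set(("url", "http://example.org", "fetched"), "1996-10-30")
print(g.get(("url", "http://example.org", "fetched")))

Further data base requirements are: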
5.4.4.1. Local link structure maintenance and database match for http-addresses
The obvious problem of WWW documents with their embedded http-keys must be addressed by an ABF system, and the separation of data and structure must be done post hoc. http-key storage is another requirement for the local data base engine. A practical problem with intermediary storage is the provision of a conversion table for http-addresses, to match the http pointer structure to the underlying operating system and provide the interface to the local file structure model. For portability reasons this must be provided transparently for all possible operating systems on which the ABF is supposed to run, just as the Java VM must be able to run equally on all target processors.
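A minimal sketch of such a conversion, mapping an http-address to a portable local path; the cache root and the naming scheme are assumptions for illustration:

import hashlib
from pathlib import Path
from urllib.parse import urlparse

CACHE_ROOT = Path("abf-store")                   # hypothetical local store

def local_path(url):
    """Deterministic, OS-independent local file location for a WWW address."""
    u = urlparse(url)
    name = (u.path.strip("/") or "index").replace("/", "__")
    if u.query:                                  # keep query strings file-system safe
        name += "_" + hashlib.md5(u.query.encode()).hexdigest()[:8]
    return CACHE_ROOT / u.netloc / name

print(local_path("http://altavista.digital.com/cgi-bin/query?q=tlsi"))
# -> abf-store/altavista.digital.com/cgi-bin__query_<hash>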
5.4.4.2. String- and fuzzy logic, linguistic strategies
The implementation of the basic set of linguistic strategies for ABF - synonym, homonym, antonym, phonetic variations, best-match, near-miss, etc. - as well as their boolean operators, is an application case of fuzzy logic and multi-level logic string searches and matches. This translates into a string data base problem that cannot be handled with most of the current string library approaches, which require explicit storage definitions for individual strings. A combination of the SNOBOL string processor operators and the MUMPS data base model is currently the only available solution for generalized, unlimited-depth string match operations that satisfy the requirements of fuzzy logic string operations, because the string space is dynamically allocated, with unlimited storage of intermediary results.
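A minimal sketch of the best-match / near-miss idea combined with a synonym table and boolean operators; the table, the threshold, and difflib as similarity measure are assumptions, not a reconstruction of the SNOBOL/MUMPS machinery:

from difflib import SequenceMatcher

SYNONYMS = {"knowledge": {"know-how", "expertise"}}      # hypothetical table

def near_miss(a, b, threshold=0.8):
    """Best-match / near-miss test on two words."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def fuzzy_hit(term, text):
    """True if the term or one of its synonyms nearly matches a word in text."""
    words = text.lower().split()
    candidates = {term.lower()} | SYNONYMS.get(term.lower(), set())
    return any(near_miss(c, w) for c in candidates for w in words)

# Boolean combination of fuzzy predicates:
text = "practical aspects of knowlege management"        # note the misspelling
print(fuzzy_hit("knowledge", text) and not fuzzy_hit("spaghetti", text))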
5.4.4.3. Data morphologies
Direct and indirect keyword searches of data bases are only the most primitive and least interesting ABF strategy. A higher-order data morphology to be processed is, for example, the following search-and-scan task:
Assume the pattern "qbv" as record
delimiter in the xyz data base, and filter out all the records with the
following pattern description:
Beginning with a ":", followed by at least
two uppercase ASCII chars, then at least three consecutive blanks, alternated
with two ASCII null chars (this throws off all C type string processors), and
have either "jwd" or "klm" in the last third of the
string.
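A minimal sketch of this record scan in Python, working on raw bytes so that the embedded ASCII NUL characters do not break the scan; the regular expression is one reading of the blank/NUL alternation described above:

import re

HEAD = re.compile(rb'^:[A-Z]{2,}(?: {3,}\x00\x00)+')     # leading record shape

def morph_filter(raw):
    """Yield the records of a byte stream delimited by b'qbv' that match
    the pattern description given in the text."""
    for record in raw.split(b"qbv"):
        last_third = record[2 * len(record) // 3:]
        if HEAD.match(record) and (b"jwd" in last_third or b"klm" in last_third):
            yield record

sample = b':AB   \x00\x00 some payload ... jwd tail' + b'qbv' + b':no match'
print(list(morph_filter(sample)))                         # first record only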
Of course any such selection strategies can be stacked and combined in boolean and fuzzy logic searches as indicated above. While AWK and PERL allow simple strategies of this kind on the common newline-delimited files of the UNIX environment, they cannot deal with arbitrary delimiting patterns. For this, a blocked file access needs to be implemented that circumvents the line-oriented UNIX tools. The next thorny case occurs when search conditions are not given as pre-input, but derived from the data material itself:
If record n belongs to a pattern
class defined by some computed property xyz, then apply the pattern
transformation function jwd to all records n+y, with the value of y
computed from the pattern of "xyz".
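A minimal sketch of this data-derived case; the property test, the offset computation, and the transformation are hypothetical placeholders for the xyz and jwd functions of the example:

def derived_transform(records, has_property, offset_from, transform):
    """If record n has the computed property, transform record n+y,
    where y itself is computed from record n."""
    out = list(records)
    for n, rec in enumerate(records):
        if has_property(rec):                    # computed property "xyz"
            y = offset_from(rec)                 # offset derived from the pattern
            if 0 <= n + y < len(out):
                out[n + y] = transform(out[n + y])   # transformation "jwd"
    return out

# Toy usage: records starting with '#' mark the record two places ahead.
print(derived_transform(["#2", "a", "b", "c"],
                        lambda r: r.startswith("#"),
                        lambda r: int(r[1:]),
                        str.upper))
# -> ['#2', 'a', 'B', 'c']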
5.4.4.4. Dynamic selection of context sensitive weight functions for priority classing
An even more compounded case occurs when dynamically computed priority-classing weight functions are applied to search results. Here the results of string processing are subsequently processed numerically. In APL terms, this would amount to making the result of level n of an array computation the decision element that determines which of several alternative level-n+1 functions is applied. Again, with anything below the formal expressive power of APL - that is, in any other known programming language - this would be very time-consuming to implement. The combination of SNOBOL, MUMPS, and APL methods can only be achieved in a data processing model where a seamless transition from one data model to the other can be implemented with one or two lines of code.
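A minimal sketch of the level-n / level-n+1 idea: the outcome of the first processing level decides which of two alternative weight functions is applied at the next level. The weight functions and the decision rule are illustrative assumptions:

def priority_class(results):
    """results: list of (document, level-n score) pairs.
    Returns the documents re-weighted and sorted by priority."""
    scores = [s for _, s in results]
    spread = max(scores) - min(scores)
    # The level-n outcome (score spread) selects the level-n+1 function.
    weight = (lambda s: s ** 2) if spread > 0.5 else (lambda s: 0.5 + s / 2)
    return sorted(((doc, weight(s)) for doc, s in results),
                  key=lambda pair: pair[1], reverse=True)

print(priority_class([("page1", 0.9), ("page2", 0.2), ("page3", 0.6)]))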
5.4.4.5. Process monitoring and Backtracking
The search possibilities listed above are all prone to lead into exponential search times and limitless waste of computing resources. Therefore process monitoring strategies and backtracking have to be implemented. Computationally, the processing requirement for this is tantamount to a special solution of the Turing halting problem. There must be a subprocess that keeps track of the intermediate results of searches and stops them if certain limits are exceeded or certain success conditions are fulfilled. Depending on the situation, a backtracking to specific exit points and, most important, to specific data structure states has to be initiated. For this to work in the general case, the underlying machine must be able to read and analyze its own subroutine stack, as well as constantly monitor its main data devices.
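A minimal sketch of such a monitored search: intermediate states are checkpointed at every branch point, and the search is abandoned, with the checkpoints kept for inspection, once a time or state budget is exceeded. The budget figures and the depth-first strategy are assumptions:

import copy
import time

def monitored_search(start, expand, is_goal, budget_s=60.0, max_states=10_000):
    """Depth-first search with resource monitoring and backtracking points."""
    checkpoints = [copy.deepcopy(start)]         # exit points for backtracking
    frontier, seen, t0 = [start], 0, time.time()
    while frontier:
        if time.time() - t0 > budget_s or seen >= max_states:
            return None, checkpoints             # stop; keep states for analysis
        state = frontier.pop()
        seen += 1
        if is_goal(state):
            return state, checkpoints
        children = expand(state)
        if children:
            checkpoints.append(copy.deepcopy(state))   # remember the branch point
            frontier.extend(children)
    return None, checkpoints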
5.4.5. Miscellaneous search and data strategies
Structural and graphical tasks in the layout of pointer structures, tree representation and manipulation; time factors: fast scan, fast tree traversal, collapsing and expanding of tree views, fish-eye techniques, collating, ordering, tabulating.
The single-minded application of the hypertext principle in the WWW is similar to undertaking a carpentry project with a saw as the sole tool available. Because of the unchecked and unmanaged implementation of this principle world-wide, there is very little to be done to cure the problem at the roots, leading to the well-known problems of senseless fragmenting and thin-spreading of data across hundreds of linked mini-files - the "lost-in-hyperspace" and "World Wide Spaghetti bowl" syndromes mentioned in the beginning. Some of this can be remedied after the fact, somewhat cumbersomely, but it is essential for KW. Some application cases will be illustrated. The professional KWr needs to access the structure of data in priority to its contents. This is the age-old lesson learned from 2500 years of literary processing that seems to have been completely lost in the current WWW craze. The implementation model given by Hyper-G provides many of the necessary examples of how to go about the task: for example, a basic facility to automatically extract the link structure from a collection of WWW pages, construct a table of contents and index, collate the single pages into a coherent volume (a Hyper-G collection), provide navigational access along the structure path, and concurrently update the view of the path structure depending on the position in the graph. Some factors as yet missing in Hyper-G deserve mention.
5.4.5.1. Basic and extended strategies of overview
The need to keep the go-to and come-from windows in concurrent visual display (in normal hypertext access, the go-to window usually superimposes or supersedes the come-from window, obscuring this vital data connection); user-programmable layout definition stratagems for keeping go-to and come-from windows in a defined arrangement on the computer screen; content-sensitive marking of windows as keep-in-view; application of UNIX-style concurrent processing models to ABF, relegating specific windows and vistas to different user processes residing on different terminals. The latter is just a different application of the watcher demon process mentioned above.
5.4.5.2. Local compacting, overview, folding, and fish eye views
Whatever the practical reasons for spreading the data out over hundreds of mini-files, when it comes to local intermediate storage it is much more sensible to compact these many files into one virtual contiguous data model. This restores the connectivity of texts that had been lost by indiscriminate application of the hypertext principle. In the WWW model, hyper links are often applied where it would have been most sensible and useful to introduce a hierarchy outline or folding level, e.g. as implemented in Microsoft Word. This gives a result similar to the juxtaposing of go-to and come-from windows. Even though the HTML data format allows several levels of headline to be defined as a standard data type, no currently available browser makes use of this provision to fold the text on the headlines, as can be done with Word.
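A minimal sketch of folding a compacted HTML text on its headline levels, giving the outline view described above; only the h1-h6 tags are considered, and the regular-expression parsing is a simplifying assumption:

import re

def outline(html, max_level=3):
    """Yield (level, heading text) pairs in document order."""
    for m in re.finditer(r"<h([1-6])[^>]*>(.*?)</h\1>", html, re.I | re.S):
        level = int(m.group(1))
        if level <= max_level:
            yield level, re.sub(r"<[^>]+>", "", m.group(2)).strip()

page = "<h1>TLSI</h1><p>text ...</p><h2>Database considerations</h2><p>...</p>"
for level, title in outline(page):
    print("  " * (level - 1) + title)            # indented, foldable outline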
5.4.6. Visual and Sound Pattern Matching and Array Processing
Bit map array manipulation is an extended APL case. While the above cases of string and data morphology matching may seem to tax the limits of current computing technology, there are applications waiting in store for future multimedia data base filtering processes: the search and retrieval of visual and sound patterns. Though this is well beyond the horizon of contemporary computer aspirations, some of the data processing principles are quite simple to state, even if difficult to implement. Visual pattern matching strategies involve a generalized class of APL data structure manipulations. In order to match bitmap reference patterns with actual data sets, various matrix transformations of the reference pattern - squeezes and stretches, linear and non-linear, convex and concave - can be xor-ed onto the target data set until a match is made.
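A minimal sketch of the xor-match idea on binary bitmaps: slide the reference pattern over the target and count disagreeing bits; a count below a threshold signals a match. numpy, the sliding loop, and the omission of the stretch/squeeze transformations are simplifying assumptions:

import numpy as np

def xor_match(target, pattern, threshold=0):
    """Yield (row, col) positions where pattern fits target within threshold."""
    ph, pw = pattern.shape
    for r in range(target.shape[0] - ph + 1):
        for c in range(target.shape[1] - pw + 1):
            window = target[r:r + ph, c:c + pw]
            if np.count_nonzero(np.bitwise_xor(window, pattern)) <= threshold:
                yield r, c

target = np.zeros((8, 8), dtype=np.uint8)
target[2:4, 3:5] = 1
pattern = np.ones((2, 2), dtype=np.uint8)
print(list(xor_match(target, pattern)))          # -> [(2, 3)]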
5.5. Literature
CACM, Dec 1992: Information Filtering
CACM, Jul 1994: Intelligent Agents
Goppold, Andreas: Das Paradigma der Interaktiven
Programmierung, Computerwoche, 24.8. und 31.8.1984
Software-Engineering auf Personal Workstations, Projekt
Leonardo-Leibniz 1988
Das Buch Leibniz, Band I - Aspekte von Leibniz, 1991
Das Buch Leibniz, Band II - Leibniz-Handbuch 1991
Leibniz: Ein skalierbares objektorientiertes
Software-Entwicklungssystem für Echtzeitanwendungen, Echtzeit '92,
Stuttgart 1992
Die Inter-Aktor Shell ACsh, GUUG-Tagung, Wiesbaden
14.-16.9.1993
Lingua Logica Leibnitiana: Ein computerbasiertes Schriftsystem
als Fortführung von Ideen der Characteristica Universalis von
Leibniz
Kongress: Leibniz und Europa, Leibniz-Gesellschaft, Hannover,
18.-23. Juli 1994, S. 276-283
The Leibniz TLSI: A secondary macro programming interface and
universal ASCII User Interface Shell for Hypermedia, CALISCE '96
Maurer, Kappe, Andrews, et al., various articles about
Hyperwave
ftp://ftp.tu-graz.ac.at/pub/Hyperwave/
Germany: ftp://elib.zib-berlin.de/pub/InfoSystems/HyperWave/
Stonier, Tom: Internet and World Wide Web: The Birth of a
Global Brain? FIS '96, Wien:
http://igw.tuwien.ac.at/fis96
Woolridge and Jennings, 1995