Authors: Russell Leidich
We have at our disposal a wide variety of discrete transforms for discovering "interesting" signals in discrete data sets of any number of dimensions. They are of particular utility when the default assumption is that the set is mundane. SETI, the Search for Extraterrestrial Intelligence, is the archetypal case, although problems in drug discovery, malware detection, financial arbitrage, geologic exploration, forensic analysis, and other diverse fields are perpetual clients of such tools. These transforms include the Fourier, wavelet, curvelet, wave atom, contourlet, and brushlet transforms, among others, which have emerged from math departments with increasing frequency since the days of Joseph Fourier. A mountain of optimized applications has been built on top of them, for example the Fastest Fourier Transform in the West[1] and the Wave Atom Toolbox[2]. Such transforms excel at discovering particular classes of signals, so much so that the return on investment in new math would appear to be approaching zero. What's missing, however, is efficiency: we must ask when such transforms are computationally justifiable. Herein we investigate a preprocessing technique, abstractly known as an "entropy transform", which, in a wide variety of practical applications, can discern in essentially real time whether or not an "interesting" signal exists within a particular data set. (Entropy transforms say nothing as to the nature of the signal, but merely indicate how interesting a particular subset of the data appears to be.) Entropy transforms have the added advantage that they can also be tuned to behave as crude classifiers: not as good as their deep learning counterparts, but requiring orders of magnitude less processing power. In applications where identifying many targets with moderate accuracy is more important than identifying a few targets with excellent accuracy, entropy transforms could bridge the gap to product viability.
It would be fair to say that in the realm of signal detection, discrete transforms should be the tool of choice because they tend to produce the most accurate and well-characterized results. But processor power and execution time are not free! Particularly when, as in the case of SETI, the bottleneck is the rate at which newly acquired data can be processed, a more productive approach would be to use cheap but reasonably accurate O(N) transforms to filter out all but the most surprising subsets of the data. This would reserve processing capacity for those rare weird cases more deserving of closer inspection. I published Agnentro[3], an open-source toolkit for signal search and comparison, first and foremost to support these broad and rather unintuitive assertions with numerical evidence. The goal of this paper is to formalize the underlying math.
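The filtering idea can be illustrated with a minimal sketch. It is not taken from the paper or from Agnentro: it assumes plain Shannon entropy over non-overlapping windows as a stand-in for an entropy transform, and the names `shannon_entropy` and `most_surprising_windows`, along with the `window_size` and `keep` parameters, are hypothetical. Scoring every window costs O(N) in the size of the data; only the final ranking of window scores is sorted.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy, in bits per symbol, of a non-empty sequence of symbols."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def most_surprising_windows(data, window_size, keep):
    """Return the start offsets of the `keep` non-overlapping windows whose
    entropy lies farthest from the median window entropy.

    Assumption: "surprising" is taken to mean "atypical entropy relative to
    the rest of the data set"; the paper's own definition may differ.
    """
    starts = range(0, len(data) - window_size + 1, window_size)
    scores = [(s, shannon_entropy(data[s:s + window_size])) for s in starts]
    if not scores:
        return []
    ordered = sorted(score for _, score in scores)
    median = ordered[len(ordered) // 2]
    # Rank windows by how far their entropy deviates from the typical value.
    ranked = sorted(scores, key=lambda item: abs(item[1] - median), reverse=True)
    return [start for start, _ in ranked[:keep]]
```

Under these assumptions, only the windows returned by `most_surprising_windows` would be handed to a more expensive discrete transform, while the mundane majority would be skipped.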
Comments: 37 Pages.
[v1] 2017-05-11 21:51:26
[v2] 2017-10-24 23:15:07
Vixra.org is a pre-print repository rather than a journal. Articles hosted may not yet have been verified by peer-review and should be treated as preliminary. In particular, anything that appears to include financial or legal advice or proposed medical treatments should be treated with due caution. Vixra.org will not be responsible for any consequences of actions that result from any form of use of any documents on this website.