Page 9 - Krész, Miklós, and Andrej Brodnik (eds.). MATCOS-13. Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science. Koper: University of Primorska Press, 2016.
P. 9
ptive Algorithms For Dynamic Programming With
Applications to Bioinformatics
[Extended Abstract]
Marko Grgurovicˇ
University of Primorska
Titov Trg 4
Koper, Slovenia
marko.grgurovic@student.upr.si
ABSTRACT It can be shown that certain dynamic programming algo-
rithms, including algorithms for the edit distance problem
In this paper we study a simple computation that lies at and algorithms used in geology [4] and in speech recogni-
the core of many dynamic programming algorithms, perhaps tion [3], can be reduced to this simple computation. The
most notably those for computing the edit distance between straightforward algorithm for computing Eq. 1 takes O(n)
two strings. Past solutions have assumed properties of the time per i, which amounts to a total of O(n2) time over all
input such as convexity and concavity and would not work i.
on general inputs. We obtain an algorithm that combines
the previous algorithms in a way that makes no assumptions Previous algorithms for this problem focused on specific
on the input, yet its running time depends on a measure of types of the function g(·) (usually referred to as the gap
“sortedness” present in the input. This measure turns out to function), e.g. those that satisfy the quadrangle inequal-
be very natural, and if the input comes from a function, it ity [1, 5, 2] or the inverse quadrangle inequality [1]. None of
corresponds to the number of inflection points of the input these algorithms produce the correct answer when g(·) does
function. An immediate consequence of the result is that not satisfy their assumptions. In this paper, we develop a
if the input comes from a polynomial of degree d, then the combination of these approaches that works for all inputs,
running time of the algorithm can be upper bounded by but has a running time that depends on the function g(·) in
a function of d. The new algorithm extends the previous a natural way.
results to a wider family of functions and is never worse
than either special cases. All logarithms in this paper are in base two.
Categories and Subject Descriptors 2. THE CONVEX AND CONCAVE CASE
F.2.2 [Theory of Computation]: Nonnumerical Algorithms In this section we describe the algorithm of [1] for the convex
and Problems case. We do not explicitly consider the concave case, since
it is essentially analogous.
General Terms
Consider the case when g(·) is a convex function, that is
Algorithms, Theory a function which increases at an increasing rate. In other
words, given a < b the following residues are implicitly
Keywords sorted:
Dynamic programming, edit distance g(b + c) − g(a + c) ≤ g(b + c ) − g(a + c ) for 0 ≤ c ≤ c .
1. INTRODUCTION Now consider the line 1, 2, ..., n. We will assign to each
point i on this line the value Y [i], i.e. the combination that
We consider the following simple computation: given a se- achieves the minimum for a given i in Eq. 1. Observe that
quence of values X1, ..., Xn and a function g(k) for 1 ≤ k ≤ we can represent Y [i] with Xk, since given some Xk and
n, compute for all 1 ≤ i ≤ n: i the choice of gap is implicit (i.e. g(i − k)). For each i
in the list we would like to find the Xk which achieves the
Yi = min1≤k
Lemma 2.1. If Xk achieves the minimum for some i, but
does not achieve the minimum for i + 1, then it does not
achieve the minimum for any i > i.
Proof. Let Xj be the element that achieves the mini-
m a t c o s -1 3 Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science 9
Koper, Slovenia, 10-11 October
Applications to Bioinformatics
[Extended Abstract]
Marko Grgurovicˇ
University of Primorska
Titov Trg 4
Koper, Slovenia
marko.grgurovic@student.upr.si
ABSTRACT It can be shown that certain dynamic programming algo-
rithms, including algorithms for the edit distance problem
In this paper we study a simple computation that lies at and algorithms used in geology [4] and in speech recogni-
the core of many dynamic programming algorithms, perhaps tion [3], can be reduced to this simple computation. The
most notably those for computing the edit distance between straightforward algorithm for computing Eq. 1 takes O(n)
two strings. Past solutions have assumed properties of the time per i, which amounts to a total of O(n2) time over all
input such as convexity and concavity and would not work i.
on general inputs. We obtain an algorithm that combines
the previous algorithms in a way that makes no assumptions Previous algorithms for this problem focused on specific
on the input, yet its running time depends on a measure of types of the function g(·) (usually referred to as the gap
“sortedness” present in the input. This measure turns out to function), e.g. those that satisfy the quadrangle inequal-
be very natural, and if the input comes from a function, it ity [1, 5, 2] or the inverse quadrangle inequality [1]. None of
corresponds to the number of inflection points of the input these algorithms produce the correct answer when g(·) does
function. An immediate consequence of the result is that not satisfy their assumptions. In this paper, we develop a
if the input comes from a polynomial of degree d, then the combination of these approaches that works for all inputs,
running time of the algorithm can be upper bounded by but has a running time that depends on the function g(·) in
a function of d. The new algorithm extends the previous a natural way.
results to a wider family of functions and is never worse
than either special cases. All logarithms in this paper are in base two.
Categories and Subject Descriptors 2. THE CONVEX AND CONCAVE CASE
F.2.2 [Theory of Computation]: Nonnumerical Algorithms In this section we describe the algorithm of [1] for the convex
and Problems case. We do not explicitly consider the concave case, since
it is essentially analogous.
General Terms
Consider the case when g(·) is a convex function, that is
Algorithms, Theory a function which increases at an increasing rate. In other
words, given a < b the following residues are implicitly
Keywords sorted:
Dynamic programming, edit distance g(b + c) − g(a + c) ≤ g(b + c ) − g(a + c ) for 0 ≤ c ≤ c .
1. INTRODUCTION Now consider the line 1, 2, ..., n. We will assign to each
point i on this line the value Y [i], i.e. the combination that
We consider the following simple computation: given a se- achieves the minimum for a given i in Eq. 1. Observe that
quence of values X1, ..., Xn and a function g(k) for 1 ≤ k ≤ we can represent Y [i] with Xk, since given some Xk and
n, compute for all 1 ≤ i ≤ n: i the choice of gap is implicit (i.e. g(i − k)). For each i
in the list we would like to find the Xk which achieves the
Yi = min1≤k
Lemma 2.1. If Xk achieves the minimum for some i, but
does not achieve the minimum for i + 1, then it does not
achieve the minimum for any i > i.
Proof. Let Xj be the element that achieves the mini-
m a t c o s -1 3 Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science 9
Koper, Slovenia, 10-11 October