Page 17 - Krész, Miklós, and Andrej Brodnik (eds.). MATCOS-13. Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science. Koper: University of Primorska Press, 2016.
P. 17
parison between a cache-oblivious range query data
structure and a quadtree
Tine Šukljan
University of Primorska
Institute Andrej Marušicˇ
Muzejski trg 2
Koper, Slovenia
tine.sukljan@upr.si
ABSTRACT multi-level hierarchy, mostly because many parameters are
used in such models to describe the memory levels. Until
After the introduction of the cache-oblivious model by Frigo the introduction of the cache-oblivious model by Frigo et
et al. in 1999, a lot of research has been done about data al. in 1999[6] that allows obtaining algorithms that are ef-
structures that work well in multi-level memory hierarchies. ficient in multi-level memory hierarchy without the use of
Range reporting, as one of the fundamental problems in complicated models.
computational geometry, received a lot of attention. As
there was a lot of research done in this area, there are Range reporting is one of the most studied problems in com-
very little implementation and comparisons between cache- putational geometry. Given a set S in Rd, the problem is to
oblivious and ”non”-cache-oblivious data structures. preprocess S, so that for a given query range q, the points in
S ∩q are reported quickly. There exist many types of queries
In this article we implemented one of the cache-oblivious as axis-aligned boxes, circles, halfspaces. If R2 there exists
data structures for range queries and compared it with the special cases like the two-sided and three-sided range report-
quadtree. The results show, that the cache-oblivious data ing. In the case of the two-sided range reporting, it consists
structure answers the queries much faster, but at the ex- of an axis-aligned box with two adjacent boundaries fixed
pense of bigger space consumption. at ∞ (or −∞ respectively). Similarly, a three-sided range
reporting consists of a axis-aligned box with one boundary
Keywords fixed at ∞ (or −∞ respectively). A four-sided query is also
called orthogonal.
cache-oblivious, range queries, quadtree
In this paper we try to make a comparison of data structures
1. INTRODUCTION for orthogonal range queries. Specifically between a classi-
cal data structure for range query which was developed for
The memory systems in modern computers normally con- the RAM model and a data structure created in the cache-
sist of several levels in a hierarchy, typically of cache, main oblivious model.
memory and disk. The access time of different levels vary
by orders of magnitude. To improve the running time of 1.1 Related work
accessing the data far from the processor, the data are often
moved from the farthest levels to the close ones in big blocks. Much of research has been done about cache-oblivious data
It is important to design the algorithms and data structures
to support high locality in the memory-access patterns. structures for range reporting, range counting and domi-
When working in RAM model, one assumes a flat mem- nance in various dimension after the introduction of the
ory system with uniform access time. Because of that most
of the algorithms and data structures exhibit low memory- cache-oblivious model. Most notable were the cache-oblivious
access locality and are not efficient in a hierarchical memory B-Trees structures[5] with O(logB N ) search and update time.
system. A lot of work has been done in a two-level mem- All the structures rely intensely on the so-called van-Emde-
ory model (I/O model) introduced by Agarwal and Vitter
in 1988[3] to model the high difference in access time be- Boas layout for storing a balanced constant-degree tree of
tween memory and disks. Much less work has been done in size O(N ) in memory so that any root-to-leaf path can
be traversed cache-obliviously in O(logB N ) memory trans-
fers[7].
Agarwal et al.[2] were the first to develop a cache-oblivious
version of a two-dimensional range tree that answers pla-
nar range queries in the optimal O(logB N + T /B) memory
using O(log22 N )
transfers of the form B = space. The downside was that B
has to be 22c for some non-negative integer
constant c. Later, Arge et al.[4] developed a static cache-
oblivious linear-space data structure for answering two-sided
queries in O(logB N + T /B). Using a simple construction
method, we can also obtain a static cache-oblivious data
m a t c o s -1 3 Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science 17
Koper, Slovenia, 10-11 October
structure and a quadtree
Tine Šukljan
University of Primorska
Institute Andrej Marušicˇ
Muzejski trg 2
Koper, Slovenia
tine.sukljan@upr.si
ABSTRACT multi-level hierarchy, mostly because many parameters are
used in such models to describe the memory levels. Until
After the introduction of the cache-oblivious model by Frigo the introduction of the cache-oblivious model by Frigo et
et al. in 1999, a lot of research has been done about data al. in 1999[6] that allows obtaining algorithms that are ef-
structures that work well in multi-level memory hierarchies. ficient in multi-level memory hierarchy without the use of
Range reporting, as one of the fundamental problems in complicated models.
computational geometry, received a lot of attention. As
there was a lot of research done in this area, there are Range reporting is one of the most studied problems in com-
very little implementation and comparisons between cache- putational geometry. Given a set S in Rd, the problem is to
oblivious and ”non”-cache-oblivious data structures. preprocess S, so that for a given query range q, the points in
S ∩q are reported quickly. There exist many types of queries
In this article we implemented one of the cache-oblivious as axis-aligned boxes, circles, halfspaces. If R2 there exists
data structures for range queries and compared it with the special cases like the two-sided and three-sided range report-
quadtree. The results show, that the cache-oblivious data ing. In the case of the two-sided range reporting, it consists
structure answers the queries much faster, but at the ex- of an axis-aligned box with two adjacent boundaries fixed
pense of bigger space consumption. at ∞ (or −∞ respectively). Similarly, a three-sided range
reporting consists of a axis-aligned box with one boundary
Keywords fixed at ∞ (or −∞ respectively). A four-sided query is also
called orthogonal.
cache-oblivious, range queries, quadtree
In this paper we try to make a comparison of data structures
1. INTRODUCTION for orthogonal range queries. Specifically between a classi-
cal data structure for range query which was developed for
The memory systems in modern computers normally con- the RAM model and a data structure created in the cache-
sist of several levels in a hierarchy, typically of cache, main oblivious model.
memory and disk. The access time of different levels vary
by orders of magnitude. To improve the running time of 1.1 Related work
accessing the data far from the processor, the data are often
moved from the farthest levels to the close ones in big blocks. Much of research has been done about cache-oblivious data
It is important to design the algorithms and data structures
to support high locality in the memory-access patterns. structures for range reporting, range counting and domi-
When working in RAM model, one assumes a flat mem- nance in various dimension after the introduction of the
ory system with uniform access time. Because of that most
of the algorithms and data structures exhibit low memory- cache-oblivious model. Most notable were the cache-oblivious
access locality and are not efficient in a hierarchical memory B-Trees structures[5] with O(logB N ) search and update time.
system. A lot of work has been done in a two-level mem- All the structures rely intensely on the so-called van-Emde-
ory model (I/O model) introduced by Agarwal and Vitter
in 1988[3] to model the high difference in access time be- Boas layout for storing a balanced constant-degree tree of
tween memory and disks. Much less work has been done in size O(N ) in memory so that any root-to-leaf path can
be traversed cache-obliviously in O(logB N ) memory trans-
fers[7].
Agarwal et al.[2] were the first to develop a cache-oblivious
version of a two-dimensional range tree that answers pla-
nar range queries in the optimal O(logB N + T /B) memory
using O(log22 N )
transfers of the form B = space. The downside was that B
has to be 22c for some non-negative integer
constant c. Later, Arge et al.[4] developed a static cache-
oblivious linear-space data structure for answering two-sided
queries in O(logB N + T /B). Using a simple construction
method, we can also obtain a static cache-oblivious data
m a t c o s -1 3 Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science 17
Koper, Slovenia, 10-11 October