Accelerating Relational Database Analytical Processing with Bulk-Bitwise Processing-in-Memory
Ben Perach, Ronny Ronen, Shahar Kvatinsky
Viterbi Faculty of Electrical & Computer Engineering
Technion Israel Institute of Technology
Haifa, Israel
Abstract—Online Analytical Processing (OLAP) for relational databases is a business decision support application. The application receives queries about the business database, usually requesting to summarize many database records, and produces few results. Existing OLAP implementations transfer a large amount of data between the memory and the CPU, perform only a few operations per datum, and produce a small output. Hence, OLAP is a good candidate for processing-in-memory (PIM), where computation is performed where the data is stored, accelerating applications by reducing data movement between the memory and the CPU. In particular, bulk-bitwise PIM, where the memory array is a bit-vector processing unit, seems a good match for OLAP. With the extensive inherent parallelism and minimal data movement of bulk-bitwise PIM, OLAP applications can process the entire database in parallel in memory, transferring only the results to the CPU. This paper shows a full-stack adaptation of bulk-bitwise PIM, from compiling SQL to the hardware implementation, for supporting OLAP applications. Evaluating the Star Schema Benchmark (SSB), bulk-bitwise PIM achieves a 4.65× speedup over MonetDB, a standard database system.
Index Terms—Processing-in-memory, Database, OLAP, Memristors

This work was supported by the European Research Council through the European Union’s Horizon 2020 Research and Innovation Programme under Grant 757259 and through the European Union’s Horizon Europe Research and Innovation Programme under Grant 101069336.
I. INTRODUCTION
Analytical processing of relational databases is a business
decision support application, allowing decision-makers to an-
alyze their business data [1]. Often, these analyses require
summarizing large sections of the database, resulting in long execution times [2] that are spent mostly on transferring data from the memory to the CPU. Summarizing large sets of data into few
results hints that this application can benefit from processing-
in-memory (PIM) techniques.
PIM techniques come in various types and technologies [3–10], all of which operate on data at (or close to) where it is stored, i.e.,
the memory. In this paper, we are interested in accelerating an-
alytical processing of relational databases with a specific PIM
technique called bulk-bitwise PIM [3–7]. Bulk-bitwise PIM
is characterized by utilizing the memory cell arrays to both
process the data and directly store the result. Thus, computing
with bulk-bitwise PIM can minimize data movement. As data-
intensive applications invest most of their time and energy
in memory access [11–13], our approach for accelerating
analytical processing consists of reducing the memory accesses required by the host. We do so by leveraging the in-place
processing of bulk-bitwise PIM to distill the information stored
in memory, transferring as little information as possible to the
host. Data transfer reduction is possible because databases can
be transferred into PIM memory once and used many times.
Note that using bulk-bitwise PIM in this way is orthogonal
to the host processor type (e.g., CPU, GPU, FPGA) since any
host must retrieve the information from the memory. Further-
more, using bulk-bitwise PIM to reduce the data movement
out of the memory cell arrays is complementary to other PIM
techniques where dedicated processing units are placed close
to the memory arrays. These dedicated processing units can
act as the hosts for the bulk-bitwise PIM technique, profiting
from the same reduced data transfer.
In this paper, we combine the thread from several of our
previous works [3, 14, 15] to show how analytical processing
for relational databases can be supported with bulk-bitwise
PIM. As the processing capabilities of bulk-bitwise PIM
depend on the data arrangement within the memory arrays,
we show how to arrange the database relations within the
memory arrays, creating a dedicated data structure for bulk-
bitwise PIM. We then identify the basic primitive of bulk-
bitwise PIM that can reduce data transfer with this new data
structure (i.e., filter and aggregate). Afterward, we show how
more complex database operations are supported for the star
schema database [1,16] (i.e., GROUP-BY and JOIN), allowing
execution of full queries. Using this support, we can execute
the full Star Schema Benchmark (SSB) [16]. We evaluate these
techniques with a memristive bulk-bitwise PIM design [3, 14]
using the gem5 [17] simulation environment.
II. BACKGROUND
A. Relational Databases and Analytical Processing
A relational database is a data model that organizes data into one or more relations (tables). Each relation consists of multiple records and attributes (shown in Fig. 1a), represented by the relation’s rows and columns, respectively. Records are independent, each holding information about a single item
with a single value for each attribute. Each relation has an
attribute (or a set of attributes) that uniquely identifies the
Fig. 1: (a) A database relation. (b) The embedding data structure within a
memory cell array.
Fig. 2: A star schema illustration.
records in the relation and is called the key of that relation.
When a relation has an attribute that has values from a key of
another relation, this attribute is called a foreign key.
Queries on the database are questions about the data held in
the database. In analytical processing, queries require finding
all records fulfilling certain conditions on their attributes and
summarizing (e.g., sum, average, max) one or more attributes
across the found records [1,2]. This summarizing can also be
requested per subgroups of the found records, where subgroups
are defined according to unique values of some attributes. This
division into subgroups is called a GROUP-BY operation.
When a query includes conditions involving attributes from
multiple relations, records from the different relations are
matched according to these conditions. The operation of
matching records is called a JOIN. When the condition be-
tween the relations’ attributes for JOIN is equality, the JOIN
is named equi-JOIN. For analytical processing queries, JOIN
operations are usually the most time-consuming part of query
execution [18].
A common database structure for analytical processing is
the star schema [1, 16]. An illustration of the star schema
structure is shown in Fig. 2. In this schema, there is a
single large relation, called the fact relation, and several small
relations called the dimension relations. The fact relation has
a foreign key to each of the dimension relations. In the star
schema, JOIN operations required by queries are, by and
large, only equi-JOINs between a dimension relation’s key and its respective foreign key in the fact relation [1]. All the SSB benchmark [16] queries use only this kind of JOIN.
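To make these terms concrete, the following plain-Python sketch walks through an analytical query in the star-schema style: an equi-JOIN of the fact relation with a dimension relation on the dimension key, a filter on a dimension attribute, and a GROUP-BY with a per-group sum. The toy relations, attribute names, and values are hypothetical and chosen only for illustration.

# Minimal sketch of a star-schema analytical query (hypothetical toy relations).
date_dim = [                      # dimension relation: key plus a year attribute
    {"d_key": 1, "d_year": 1992},
    {"d_key": 2, "d_year": 1993},
    {"d_key": 3, "d_year": 1994},
]
fact = [                          # fact relation: foreign key d_key plus a measure attribute
    {"d_key": 1, "revenue": 99},
    {"d_key": 2, "revenue": 10},
    {"d_key": 2, "revenue": 20},
    {"d_key": 3, "revenue": 30},
]

dim_by_key = {d["d_key"]: d for d in date_dim}
groups = {}
for r in fact:
    d = dim_by_key[r["d_key"]]        # equi-JOIN: match the foreign key to the dimension key
    if d["d_year"] >= 1993:           # filter condition on a dimension attribute
        groups[d["d_year"]] = groups.get(d["d_year"], 0) + r["revenue"]  # GROUP-BY year, SUM(revenue)

print(groups)  # {1993: 30, 1994: 30}; the 1992 record is filtered out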
B. Bulk-Bitwise PIM
Bulk-bitwise PIM uses the memory cell arrays as processing
units, which can be implemented with DRAM or emerging
nonvolatile memory technologies [3–7]. Because of the regular
structure of these arrays, the supported operations are bitwise
logic operations (e.g., AND, NOT, NOR) between array rows
or columns (shown in Fig. 3). When multiple memory cell
arrays operate concurrently, the effective operation is a very wide bitwise operation, i.e., a bulk-bitwise operation. As such,
bulk-bitwise PIM can process data where it is stored and
exhibit high computational bandwidth.
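As a rough illustration of this processing model, the following Python sketch treats a cell array as rows of bits and applies the same bitwise operation between two of its columns in every row, writing the result into a third column. The array contents and the helper function are toy assumptions, not the hardware interface.

# Minimal toy model of a column-wise bulk-bitwise operation: every row of the
# cell array computes the same bitwise operation on two of its columns and
# writes the result into a third column, conceptually in parallel across rows.
array = [  # 4 rows x 6 columns of memory cells (bits); columns 4-5 are spare
    [1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
]

def column_op(array, in_a, in_b, out, op=lambda a, b: a & b):
    """Apply `op` to columns in_a and in_b of every row, storing the result in column out."""
    for row in array:                 # conceptually concurrent across all rows
        row[out] = op(row[in_a], row[in_b])

column_op(array, in_a=0, in_b=2, out=4)                   # AND of columns 0 and 2 into column 4
column_op(array, 1, 3, 5, op=lambda a, b: 1 - (a | b))    # NOR of columns 1 and 3 into column 5
print([row[4] for row in array], [row[5] for row in array])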
To support virtual memory, bulk-bitwise PIM operations
are restricted to use and rewrite data within a single virtual
page [3], usually a huge page. This way, when a program sends
Fig. 3: A 3 × 4 memory array with a column-wise logic operation.
a PIM operation to memory with a virtual address, the virtual
address can be translated in the standard fashion and the PIM
operation is routed by the hardware to its designated place.
To perform the same operation on several pages, however,
the same operation has to be sent to each page separately. In
addition, since the PIM computation is tightly coupled with
the layout of data in the memory cell arrays, data for PIM has
to be structured in a specific, dedicated way. To allow software
in virtual memory fine-grain control over the layout of data
structures within the cell arrays, the mapping of addresses to
cell array location is part of the bulk-bitwise programming
model [3]. The mapping is specified over the page-offset bits of the virtual address since these bits do not change under virtual-to-physical address translation, giving software operating in virtual address space control over them.
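The following Python sketch illustrates one possible decoding of page-offset bits into a cell-array location. The page size, array geometry, and bit split are assumptions chosen for illustration, not the mapping defined by the programming model in [3].

# Minimal sketch (all sizes are assumptions) of decoding the page-offset bits of a
# virtual address into a cell-array location inside a 2 MiB huge page.
# Assumed geometry: 16 arrays per page, each 1024 rows x 1024 bits (128 bytes per row).
PAGE_OFFSET_BITS = 21   # 2 MiB huge page
ROW_BYTES = 128         # 1024-bit rows
ROWS_PER_ARRAY = 1024

def decode_page_offset(vaddr):
    """Return (array_index, row, byte_in_row) under the assumed mapping."""
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)   # page-offset bits survive address translation
    byte_in_row = offset % ROW_BYTES
    row = (offset // ROW_BYTES) % ROWS_PER_ARRAY
    array_index = offset // (ROW_BYTES * ROWS_PER_ARRAY)
    return array_index, row, byte_in_row

print(decode_page_offset(0x0123456))  # locate a byte of the huge page in its cell array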
To guarantee program correctness, the ordering rules of PIM
operations with the standard memory operations (e.g., loads,
stores) have to be defined [14]. Having well-defined ordering
rules for bulk-bitwise PIM requires the PIM memory and host
caches to be coherent, supported by the host hardware [14].
III. THE RELATION DATA STRUCTURE
A database relation is stored in a memory cell array, as shown in Fig. 1b. Since the processing capabilities of bulk-bitwise PIM are directly tied to the layout of data within the memory cell array, the data must be stored such that it can be processed directly, without further movement.
To this end, each record is set across a cell array row, and each
attribute spans several columns. In this data structure, column-
wise operations can be used to process all records in parallel.
Some cell array columns, however, have to remain empty to
store the PIM results before they are read from the memory.
Note that a record’s bytes are not consecutive in virtual address
space due to the address mapping.
If a single array row is not enough to hold all the attributes
of a relation, the relation’s attributes must be split across more than a single cell array. These additional cell arrays are placed
on different pages and might require moving data between the
pages using standard loads and stores.
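As an illustration of this data structure, the following Python sketch packs records into row bit-vectors under an assumed set of attribute widths and reserved result columns; the field names and sizes are hypothetical.

# Minimal sketch (assumed field widths) of the relation data structure: each record
# occupies one cell-array row, each attribute occupies a fixed column range, and some
# columns are left free for PIM results.
ATTRS = {"key": (0, 32), "quantity": (32, 48), "price": (48, 80)}  # attribute -> (start_bit, end_bit)
RESULT_COLS = (80, 96)   # columns reserved for filter/aggregation results

def pack_record(record):
    """Pack an attribute dict into a row bit-vector (an int used as a bit string)."""
    row = 0
    for name, (start, end) in ATTRS.items():
        width = end - start
        value = record[name] & ((1 << width) - 1)   # truncate to the attribute's column range
        row |= value << start
    return row

rows = [pack_record(r) for r in [{"key": 1, "quantity": 5, "price": 700},
                                 {"key": 2, "quantity": 9, "price": 250}]]
print([hex(r) for r in rows])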
IV. BULK-BITWISE PIM PRIMITIVES
This section presents the basic database operations using
the relation data structure and bulk-bitwise PIM capabilities.
These database primitives reduce the required data transfer
between the host and memory.
A. Filter
A filter operation, shown in Fig. 4a, checks a condition
across all relation records. This operation filters all records
for a query or a single subgroup in a GROUP-BY operation
Fig. 4: Illustration of PIM primitives: (a) filter and (b) aggregation. The results
are marked with red cells.
(Section V-B). The condition checking is done with PIM,
where the result is a single bit per record. The resulting bit is
stored in the same cell array column across all cell arrays of the
relation. Hence, to determine which records passed the condition, the host only needs to read a single bit per record instead of the condition’s attributes for each record. The data transfer reduction depends on the number of attributes in the condition, the attributes’ lengths, and the data itself (as non-PIM techniques are data dependent [2]).
To support the filter primitive, the PIM module instruc-
tion set includes comparison operations (e.g., equality, less-
than), logic operations (e.g., AND, OR, NOT), and arithmetic
operations (e.g., addition, multiplication). These operations
are supported both between two attributes and between an attribute and an immediate. Additionally, they are supported for multiple attribute lengths.
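The following Python sketch illustrates how such a comparison can be evaluated in a bit-slice fashion, one attribute bit column at a time, producing one result bit per record. The record count, attribute width, and column encoding are toy assumptions rather than the PIM module's actual instruction set.

# Minimal bit-slice sketch of the filter primitive: each attribute bit position is one
# cell-array column, held here as a Python int whose i-th bit belongs to record i.
# A column-wise "less-than immediate" yields one result bit per record.
RECORDS = 8
MASK = (1 << RECORDS) - 1

def filter_less_than(attr_cols, imm, width):
    """attr_cols[j] is the column of bit j (LSB first) of the attribute for all records."""
    eq, lt = MASK, 0
    for j in reversed(range(width)):                 # scan bit columns from MSB to LSB
        a = attr_cols[j]
        k = MASK if (imm >> j) & 1 else 0            # broadcast bit j of the immediate
        lt |= eq & ~a & k & MASK                     # equal so far, and attribute bit < immediate bit
        eq &= ~(a ^ k) & MASK                        # remains equal after this bit
    return lt                                        # one filter bit per record

# Toy data: 8 records with 4-bit attribute values, turned into bit columns.
values = [0, 1, 2, 3, 12, 13, 14, 15]
cols = [sum(((v >> j) & 1) << i for i, v in enumerate(values)) for j in range(4)]
print(bin(filter_less_than(cols, imm=4, width=4)))  # records 0-3 pass -> 0b1111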
B. Aggregation
The aggregation operation, shown in Fig. 4b, reduces a
specific relation attribute, across all of the relation’s records, to a single
value (e.g., sum, average, max). This reduction, however,
includes values only from selected records. This selection is
done by first filtering the records according to the condition
(as in the filter primitive in Section IV-A). Instead of reading
the filter result, it is used to generate a masked version of
the attribute to be aggregated (without overwriting the original
attribute). The mask operation nullifies the non-selected values
so they will not affect the aggregation operation. The masked
attribute is then aggregated, according to the desired operation,
within each cell array, resulting in a single value per cell array.
Afterward, the host reads the single value from each cell array
and aggregates them to complete the operation. Hence, the host
only has to read a single value per cell array per aggregation.
For example, when summing an attribute, the attribute’s
bits are ANDed with the filter result bit to create the masked
attribute. Hence, the unselected records have a zero value in
the masked attribute, while the selected records’ values remain
the same. Summing the masked attribute will have the same
result as summing only the attribute for the selected records.
The host reads the single values in each cell array and sums
them together, producing the final result.
Since the aggregation operations are performed value by
value, they must be commutative and associative. Hence, the
PIM module supports the aggregation operation for sum, min,
and max within each cell array. Other aggregation operations
can be supported as a combination of commutative and as-
sociative aggregation. For example, to perform an average (a
non-associative operation) on an attribute, the attribute is first
Fig. 5: Primitive operation speedup and data transfer reduction (last-level
cache misses) on the TPC-H benchmark. (a) Filter-only and (b) full queries.
summed. Then, the filter result is also aggregated using sum,
resulting in the count of the selected records. The host then
divides the total sum of the records by their count to produce
the average.
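The following Python sketch mimics this flow with plain lists: per-cell-array masked partial sums, a host-side final reduction, and an average built from the sum and count aggregations. The data layout and values are toy assumptions for illustration only.

# Minimal sketch of the aggregation primitive and the host-side finish: the filter bit
# masks each attribute value, every cell array reduces its masked values to one partial
# result, and the host combines the partials.
def pim_aggregate(arrays, agg=sum):
    partials = []
    for records in arrays:                                  # one entry per cell array
        masked = [value if selected else 0 for value, selected in records]  # mask via the filter bit
        partials.append(agg(masked))                        # single value per cell array
    return partials

arrays = [[(10, 1), (20, 0)], [(30, 1), (40, 1)]]           # (attribute value, filter bit) per record
partial_sums = pim_aggregate(arrays)                        # host reads one value per cell array
partial_counts = pim_aggregate([[(1, s) for _, s in a] for a in arrays])  # count = sum of filter bits
total, count = sum(partial_sums), sum(partial_counts)       # host-side final reduction
print(total, count, total / count)                          # sum=80, count=3, average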
C. Evaluating PIM Primitives
To evaluate the performance of the PIM primitives, we
used the TPC-H benchmark [19], a relational database analytical processing benchmark. The baseline for comparison is an in-house implementation of an in-memory database [2], where the entire database is stored in DRAM main memory. The baseline ran on the same system as the PIM setup and executed the same query sections as the PIM system. A full evaluation
description and more analysis are presented in [3].
The speedup and memory-access reduction (LLC misses) of PIM over the baseline are shown in Fig. 5. The queries are divided into two groups: full queries, where PIM performs the entire query, including the required aggregations, and filter-only queries, where PIM performs only the filter part of the query. Since aggregation provides a more substantial data transfer reduction, full queries achieve a significantly higher speedup. We also see that the data transfer
reduction is similar in magnitude to the speedup, supporting
our approach that bulk-bitwise PIM speedup comes from the
data transfer reduction.
V. SUPPORTING THE STAR SCHEMA
Using the PIM database primitives and understanding their
strengths and weaknesses, more complex operations can be
designed and supported. This section presents the support for
JOIN and GROUP-BY for the star schema. Full details and
full evaluation are presented in [15].
A. Supporting JOIN
JOIN requires matching records from different relations,
where the matching itself is data-dependent. Hence, perform-
ing JOIN requires many data movements, which conflicts
with the goal of data movement reduction. To avoid this,
JOIN is accomplished by pre-computing the required JOIN
output and storing the relations as a single JOINed relation.
Pre-computing a JOIN operation is a known technique to
accelerate query execution, appearing as denormalization [20]
or materialized views [21]. Storing a pre-computed JOIN,
however, has drawbacks. Characteristics of the star schema
and bulk-bitwise PIM can mitigate these drawbacks.
A significant drawback of pre-computed JOINs is their lack of flexibility. If a query requires a different JOIN than the one
pre-computed, then the pre-computation is not helpful, and
Fig. 6: Execution time relative to one-xb for the Star Schema Benchmark (SSB) queries.
the required JOIN has to be performed. For the star schema,
most JOIN operations are an equi-JOIN between a dimension
relation and the fact relation on the dimension key [1, 16].
All of the JOIN operations of the SSB [16] benchmark are of
this type. Hence, we cover and accelerate the vast majority of
cases by storing the fact relation equi-JOINed to all dimension
relations on their key.
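The following Python sketch shows this idea on hypothetical toy relations: the fact relation is materialized already equi-JOINed with every dimension relation on its key, so queries touch only the single widened relation. The relation and attribute names are assumptions for illustration.

# Minimal sketch of the pre-computed JOIN (denormalization): store the fact relation
# already equi-JOINed with every dimension relation on its key.
fact = [{"d_key": 1, "s_key": 7, "revenue": 10}, {"d_key": 2, "s_key": 7, "revenue": 30}]
date_dim = {1: {"d_year": 1993}, 2: {"d_year": 1994}}      # dimension key -> attributes
supp_dim = {7: {"s_region": "ASIA"}}

joined = [{**f, **date_dim[f["d_key"]], **supp_dim[f["s_key"]]} for f in fact]
print(joined[0])  # one wide record holding the fact and dimension attributes together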
Using pre-computed JOINs also complicates mainte-
nance [20,21]. Since JOIN operations duplicate data, perform-
ing an UPDATE on a pre-computed JOIN requires changing
many entries. Using bulk-bitwise PIM, however, alleviates this
drawback. The filter primitive can efficiently find all records to
update. Furthermore, PIM operations can perform the update
itself, without reading any value from the PIM memory (i.e., implementing a PIM MUX with the filter result as the select signal [15]).
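The following Python sketch illustrates such a multiplexed update in the same bit-slice toy model as before: for each attribute bit column, the filter result column selects between the broadcast new bit and the old column. The sizes and values are assumptions for illustration.

# Minimal sketch of an in-place UPDATE via a column-wise MUX: records selected by the
# filter take the new bit, the rest keep their old bit, and nothing is read out to the host.
RECORDS = 4
MASK = (1 << RECORDS) - 1

def mux_update(old_col, new_bit, select_col):
    """Column-wise MUX with the filter result as the select signal."""
    new_col = MASK if new_bit else 0
    return ((select_col & new_col) | (~select_col & old_col)) & MASK

old = 0b0101      # one bit column of some attribute, one bit per record
select = 0b0011   # filter result: records 0 and 1 are to be updated
print(bin(mux_update(old, new_bit=1, select_col=select)))  # -> 0b111, selected records updated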
B. Supporting GROUP-BY
GROUP-BY can be performed simply by using PIM aggregation for every subgroup, aggregating the subgroups one by one. The number of subgroups, however, can be large.
Performing many aggregations may result in long latency, high
energy, and high power, and reduce the lifetime of the PIM
module due to the limited endurance of memory cells [3].
To mitigate these deficiencies of PIM aggregation, a stan-
dard CMOS logic circuit is added to the memory cell array
peripherals. The circuit performs the aggregation by reading the masked attribute to be aggregated, value by value, from its cell array. The circuit aggregates the read values and writes
the final result back to the cell array. Thus, the latency, energy,
and power of the aggregation primitive are reduced compared
to a pure bulk-bitwise PIM aggregation. Furthermore, the PIM
module lifetime is increased, as cells are written only at the
end of the aggregation operation.
Nevertheless, the latency of the entire GROUP-BY opera-
tion can still be high if there are many subgroups. We note that
aggregation can be performed in another way. The records for
the query (for all subgroups) can be filtered, then read record-
by-record by the host. The host then classifies and aggregates the records into their subgroups. This host aggregation technique
is dependent on the number of records in the query and is
independent of the number of subgroups. PIM aggregation,
however, depends on the number of subgroups and not the
number of records in the query. Hence, it is beneficial to
have PIM aggregating a few large subgroups, leaving many
small subgroups to host aggregation, exploiting the strength
of each method. To divide the subgroups between PIM and
host aggregation, a performance model is used along with an
estimate for subgroups’ sizes [15]. This estimate is done by
taking a small sample from the database. This GROUP-BY
technique is adapted from an in-cloud processing work [22].
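The following Python sketch illustrates one way such a split could be made: subgroup sizes are estimated from a small random sample, and a simple cost model assigns each subgroup to PIM or host aggregation. The cost constants and the estimator are assumptions for illustration, not the performance model of [15].

# Minimal sketch of splitting GROUP-BY work between PIM and host aggregation.
import random

def estimate_group_sizes(records, group_attr, sample_size=100):
    """Estimate each subgroup's record count by scaling counts from a small sample."""
    sample = random.sample(records, min(sample_size, len(records)))
    scale = len(records) / len(sample)
    counts = {}
    for r in sample:
        counts[r[group_attr]] = counts.get(r[group_attr], 0) + 1
    return {g: c * scale for g, c in counts.items()}

def split_groups(est_sizes, pim_cost_per_group=20000.0, host_cost_per_record=1.0):
    """Assign a subgroup to PIM only if that is cheaper than streaming its records to the host."""
    pim_groups = {g for g, n in est_sizes.items() if pim_cost_per_group < n * host_cost_per_record}
    host_groups = set(est_sizes) - pim_groups
    return pim_groups, host_groups

random.seed(0)  # deterministic sample for this example
records = [{"region": "ASIA" if i % 10 else "EUROPE"} for i in range(100000)]
est = estimate_group_sizes(records, "region", sample_size=1000)
print(split_groups(est))  # the large subgroup goes to PIM aggregation, the small one to the host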
C. Star Schema Evaluation
We evaluate our solutions on the SSB benchmark [16].
The execution times are shown in Fig. 6. We compared three PIM versions. one-xb stores each record of the JOINed relation within a single memory cell array. two-xb vertically splits the JOINed relation’s records into two memory cell arrays, placing the fact relation’s attributes in one array and all of the dimension relations’ attributes in the other. pimdb is the same as one-xb, except that the PIM aggregation (PIM-gb) is performed with pure bulk-bitwise operations (without the added aggregation circuit). Two baselines are compared, mnt-reg and mnt-join. Both baselines run MonetDB [23], a real in-memory database system running on real hardware; mnt-reg and mnt-join hold the SSB database in its standard form and as a pre-computed JOIN, respectively.
The best execution time is for one-xb, having a geo-mean
speedup of 7.46× and 4.65× over mnt-reg and mnt-join,
respectively. two-xb is 3.39× slower (in geo-mean) than one-
xb since many data transfers are required between the memory
cell arrays of the relation. two-xb, however, still achieves a 1.37× speedup over mnt-join. The cases where a non-PIM execution is
faster than a PIM execution (Q2.1, Q3.1, and Q4.1) are where
the PIM execution does not achieve a significant data transfer
reduction. In these cases, the number of records required
by the query is high. Due to the data structure of the PIM
relation, most of the relation’s records are transferred to the
host, resulting in little to no data reduction and removing the
advantage of PIM. See [15] for more details and discussions.
VI. CONCLUSIONS
This paper shows how to support analytical processing of
relational databases using bulk-bitwise PIM. Our bulk-bitwise
PIM technique aimed to reduce the required data transfers. By
substituting serial data accesses to memory with very wide and
short operations within the memory, we achieve a significant
speedup over von Neumann machines.
We first designed a data structure suited for bulk-bitwise
PIM. Then, we identified and supported primitive operations
(filter and aggregate), performing relevant functionality and
reducing data transfer. These primitives were evaluated and
studied. Based on these primitives, we supported more complex operations (JOIN and GROUP-BY) and evaluated a full
database benchmark (SSB) for a system based on memristive
bulk-bitwise PIM. We believe this work will inspire other
research for further adaptation of applications for bulk-bitwise
PIM.
REFERENCES
[1] R. Kimball and M. Ross, The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd ed. Wiley Publishing, 2013.
[2] H. Plattner, A Course in In-Memory Data Management: The Inner Mechanics of In-Memory Databases. Springer Publishing Company, Incorporated, 2013.
[3] B. Perach, R. Ronen, B. Kimelfeld, and S. Kvatinsky, “PIMDB: Understanding Bulk-Bitwise Processing In-Memory Through Database Analytics,” 2022. [Online]. Available: https://arxiv.org/abs/2203.10486
[4] V. Seshadri, D. Lee, T. Mullins, H. Hassan, A. Boroumand, J. Kim, M. A. Kozuch, O. Mutlu, P. B. Gibbons, and T. C. Mowry, “Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology,” in 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017, pp. 273–287.
[5] N. Hajinazar, G. F. Oliveira, S. Gregorio, J. D. Ferreira, N. M. Ghiasi, M. Patel, M. Alser, S. Ghose, J. Gómez-Luna, and O. Mutlu, “SIMDRAM: A framework for bit-serial SIMD processing using DRAM,” in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’21. New York, NY, USA: Association for Computing Machinery, 2021, pp. 329–345. [Online]. Available: https://doi.org/10.1145/3445814.3446749
[6] S. Resch, S. K. Khatamifard, Z. I. Chowdhury, M. Zabihi, Z. Zhao, H. Cilasun, J. P. Wang, S. S. Sapatnekar, and U. R. Karpuzcu, “MOUSE: Inference In Non-volatile Memory for Energy Harvesting Applications,” in 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2020, pp. 400–414.
[7] S. Li, C. Xu, Q. Zou, J. Zhao, Y. Lu, and Y. Xie, “Pinatubo: A processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories,” in 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016, pp. 1–6.
[8] V. Seshadri, Y. Kim, C. Fallin, D. Lee, R. Ausavarungnirun, G. Pekhimenko, Y. Luo, O. Mutlu, P. B. Gibbons, M. A. Kozuch, and T. C. Mowry, “RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization,” in 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2013, pp. 185–197.
[9] J. H. Kim, S.-h. Kang, S. Lee, H. Kim, W. Song, Y. Ro, S. Lee, D. Wang, H. Shin, B. Phuah, J. Choi, J. So, Y. Cho, J. Song, J. Choi, J. Cho, K. Sohn, Y. Sohn, K. Park, and N. S. Kim, “Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond,” in 2021 IEEE Hot Chips 33 Symposium (HCS), 2021, pp. 1–26.
[10] J. Reuben, R. Ben-Hur, N. Wald, N. Talati, A. H. Ali, P.-E. Gaillardon, and S. Kvatinsky, “Memristive logic: A framework for evaluation and comparison,” in 2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS), 2017, pp. 1–8.
[11] W. A. Wulf and S. A. McKee, “Hitting the memory wall: Implications of the obvious,” ACM SIGARCH Computer Architecture News, vol. 23, no. 1, pp. 20–24, 1995. [Online]. Available: https://doi.org/10.1145/216585.216588
[12] J. Langguth, X. Cai, and M. Sourouri, “Memory bandwidth contention: Communication vs computation tradeoffs in supercomputers with multicore architectures,” in 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), 2018, pp. 497–506.
[13] A. Pedram, S. Richardson, M. Horowitz, S. Galal, and S. Kvatinsky, “Dark memory and accelerator-rich system optimization in the dark silicon era,” IEEE Design & Test, vol. 34, no. 2, pp. 39–50, 2017.
[14] B. Perach, R. Ronen, and S. Kvatinsky, “On consistency for bulk-bitwise processing-in-memory,” in 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2023, pp. 705–717.
[15] ——, “Enabling relational database analytical processing in bulk-bitwise processing-in-memory,” 2023. [Online]. Available: https://arxiv.org/abs/2302.01675
[16] P. O’Neil et al., The Star Schema Benchmark and Augmented Fact Table Indexing. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 237–252. [Online]. Available: https://doi.org/10.1007/978-3-642-10424-4_17
[17] N. Binkert et al., “The gem5 simulator,” SIGARCH Comput. Archit. News, vol. 39, no. 2, pp. 1–7, August 2011.
[18] M. Dreseler, M. Boissier, T. Rabl, and M. Uflacker, “Quantifying TPC-H choke points and their optimizations,” Proc. VLDB Endow., vol. 13, no. 8, pp. 1206–1220, 2020. [Online]. Available: https://doi.org/10.14778/3389133.3389138
[19] “TPC benchmark H standard specification revision 3.0.0,” http://tpc.org/tpch/, Transaction Processing Performance Council, 2021.
[20] S. K. Shin and G. L. Sanders, “Denormalization Strategies for Data Retrieval from Data Warehouses,” Decis. Support Syst., vol. 42, no. 1, pp. 267–282, 2006.
[21] R. Chirkova and J. Yang, “Materialized Views,” Foundations and Trends® in Databases, vol. 4, no. 4, pp. 295–405, 2012.
[22] X. Yu et al., “PushdownDB: Accelerating a DBMS Using S3 Computation,” in ICDE-36, 2020.
[23] S. Idreos et al., “MonetDB: Two Decades of Research in Column-oriented Database Architectures,” IEEE Data Eng. Bull., vol. 35, no. 1, pp. 40–45, 2012.