Benchmarking

Running on your own cluster

Opaque supports a command-line interface for benchmarking against plaintext Spark. The following steps show you how to build and submit benchmarking jobs to a Spark cluster.

Create the benchmarking data:
```
build/sbt data
```
Create a fat jar that contains both source and test classes:
```
build/sbt test:assembly
```

To see usage information and a list of available flags, pass --help to the benchmarking class:

```
build/sbt 'test:runMain edu.berkeley.cs.rise.opaque.benchmark.Benchmark --help'
    # Available flags:
    # --num-partitions: specify the number of partitions the data should be split into.
        # Default: spark.default.parallelism
    # --size: specify the size of the dataset that should be loaded into Spark.
        # Default: sf_001
        # Supported values: sf_001, sf_01, sf_1
        # Note: sf_{scalefactor} indicates {scalefactor} * 1GB size datasets.
    # --filesystem-url: optional argument to specify the filesystem master node URL.
        # Default: file://
    # --log-operators: boolean indicating whether to log individual physical operators.
        # Default: false
        # Note: may reduce performance if set to true (forces caching of
        # intermediate values).
    # --operations: select the different operations that should be benchmarked.
        # Default: all
        # Available operations: logistic-regression, tpc-h
        # Syntax: --operations logistic-regression,tpc-h
    # Leave --operations flag blank to run all benchmarks
```

Alternatively, you can look at Benchmark.scala.
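The flags listed above can be combined into a single invocation. The following is a minimal sketch, not part of Opaque: `benchmark_flags` is a hypothetical helper that checks a `--size` value against the supported values listed in the help text and prints the matching sbt command without running it.

```shell
# Hypothetical helper (not part of Opaque): build a flag string from a size
# and an optional comma-separated list of operations.
benchmark_flags() {
    bf_size="${1:-sf_001}"   # documented default for --size
    bf_ops="${2:-}"          # empty means: run all benchmarks
    case "$bf_size" in
        sf_001|sf_01|sf_1) ;;
        *) echo "unsupported --size: $bf_size" >&2; return 1 ;;
    esac
    if [ -n "$bf_ops" ]; then
        echo "--size $bf_size --operations $bf_ops"
    else
        echo "--size $bf_size"
    fi
}

# Print (do not run) the sbt command for a small TPC-H-only benchmark.
echo "build/sbt 'test:runMain edu.berkeley.cs.rise.opaque.benchmark.Benchmark $(benchmark_flags sf_001 tpc-h)'"
```

Rejecting unknown sizes before launching sbt is a design choice of this sketch; the benchmark class may handle invalid values differently.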

Submit the job to Spark:

```
spark-submit --class edu.berkeley.cs.rise.opaque.benchmark.Benchmark \
    <Spark configuration parameters> \
    ${OPAQUE_HOME}/target/scala-2.12/opaque-assembly-0.1.jar \
        <flags>
```

For more help on submitting jobs to Spark, see Submitting applications. For a complete list of possible values for <Spark configuration parameters>, see Spark properties.
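Because the submit step depends on the fat jar built earlier, it can help to check that the jar exists before calling spark-submit. The sketch below assumes that `OPAQUE_HOME` points at your Opaque checkout; `require_jar` is a hypothetical helper, not something Opaque provides.

```shell
# Assumption: OPAQUE_HOME points at the Opaque checkout. The jar path is the
# one used in the spark-submit command above.
OPAQUE_JAR="${OPAQUE_HOME:-.}/target/scala-2.12/opaque-assembly-0.1.jar"

# Hypothetical helper: return 0 if the jar exists, otherwise print a hint.
require_jar() {
    if [ -f "$1" ]; then
        return 0
    fi
    echo "jar not found: $1 (run build/sbt test:assembly first)" >&2
    return 1
}

# Usage (commented out so the sketch runs without a Spark installation):
# require_jar "$OPAQUE_JAR" && spark-submit \
#     --class edu.berkeley.cs.rise.opaque.benchmark.Benchmark \
#     "$OPAQUE_JAR" --size sf_001
```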

Our TPC-H results

We used a 3-node cluster with 4 cores and 16 GB of memory per node.

Our spark-defaults.conf:

```
spark.driver.memory                3g
spark.executor.memory              11g
spark.executor.instances           3

spark.default.parallelism          36
spark.task.maxFailures             10
```
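The parallelism setting can be related to the cluster size with simple arithmetic. The sketch below is our reading of the numbers, not a stated rationale: `partitions_per_core` is a hypothetical helper that divides the default parallelism by the total core count.

```shell
# Hypothetical helper: partitions per core = parallelism / (nodes * cores).
# Arguments: NODES CORES_PER_NODE PARALLELISM
partitions_per_core() {
    echo $(( $3 / ($1 * $2) ))
}

# 3 nodes, 4 cores each, spark.default.parallelism = 36
partitions_per_core 3 4 36   # prints 3
```

With 12 cores in total, a parallelism of 36 works out to 3 partitions per core, which is in line with the Spark tuning guide's general suggestion of 2 to 3 tasks per CPU core.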

The command we used to submit the benchmark:

```
spark-submit --class edu.berkeley.cs.rise.opaque.benchmark.Benchmark \
    --master spark://<master IP>:7077 \
    --deploy-mode client \
    ${OPAQUE_HOME}/target/scala-2.12/opaque-assembly-0.1.jar \
        --filesystem-url hdfs://<master IP>:9000 \
        --size sf_1 \
        --operations tpc-h
```

Final results:

TPC-H Query Results

| Query number | Insecure (ms) | Encrypted (ms) | Slowdown factor |
|---|---|---|---|
| 1 | 4239.776 | 221778.54 | 52.30902293 |
| 2 | 3714.381 | 42819.894 | 11.52813726 |
| 3 | 4025.312 | 44301.481 | 11.00572602 |
| 4 | 5280.407 | 51005.712 | 9.659428146 |
| 5 | 3952.555 | 72996.46 | 18.46817059 |
| 6 | 1725.161 | 17447.919 | 10.1137917 |
| 7 | 3805.931 | 98998.996 | 26.01176847 |
| 8 | 3935.96 | 76687.933 | 19.48392082 |
| 9 | 6445.108 | 114628.632 | 17.78537024 |
| 10 | 4248.211 | 37983.802 | 8.941128866 |
| 11 | 2132.784 | 32503.568 | 15.23997179 |
| 12 | 2491.55 | 24569.189 | 9.8610058 |
| 13 | 1846.523 | 42918.316 | 23.24277358 |
| 14 | 2156.878 | 12442.28 | 5.768652654 |
| 15 | 4194.705 | 29534.721 | 7.040953059 |
| 16 | 2715.897 | 21649.943 | 7.971562618 |
| 17 | 3378.273 | 122752.706 | 36.33593437 |
| 18 | 5397.52 | 230610.221 | 42.72521843 |
| 19 | 1854.364 | 10147.414 | 5.472180219 |
| 20 | 4439.338 | 25883.642 | 5.830518424 |
| 21 | 7403.973 | 241678.371 | 32.64171425 |
| 22 | 2137.743 | 16989.443 | 7.947373936 |
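The slowdown factor in the last column is the encrypted runtime divided by the insecure runtime. As a minimal sketch for reproducing it from your own measurements, the hypothetical `slowdown` helper below uses awk for the floating-point division and rounds to two decimals.

```shell
# Hypothetical helper: print ENCRYPTED_MS / INSECURE_MS rounded to 2 decimals.
# Arguments: INSECURE_MS ENCRYPTED_MS
slowdown() {
    LC_ALL=C awk -v insecure="$1" -v encrypted="$2" \
        'BEGIN { printf "%.2f\n", encrypted / insecure }'
}

slowdown 4239.776 221778.54   # query 1, prints 52.31
slowdown 2156.878 12442.28    # query 14, prints 5.77
```

In the results above, the factor ranges from about 5.5 (query 19) to about 52.3 (query 1).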