Benchmarking

Running on your own cluster

Opaque supports a command-line interface for benchmarking against plaintext Spark. The following steps show you how to build and submit benchmarking jobs to a Spark cluster.

  1. Create the benchmarking data:

    build/sbt data
    
  2. Create a fat jar that contains both source and test classes:

    build/sbt test:assembly
    
  3. For usage information and a list of available flags, pass --help to the benchmarking class:

    build/sbt 'test:runMain edu.berkeley.cs.rise.opaque.benchmark.Benchmark --help'
        # Available flags:
        # --num-partitions: specify the number of partitions the data should be split into.
            # Default: spark.default.parallelism
        # --size: specify the size of the dataset that should be loaded into Spark.
            # Default: sf_001
            # Supported values: sf_001, sf_01, sf_1
            # Note: sf_{scalefactor} indicates {scalefactor} * 1GB size datasets.
        # --filesystem-url: optional argument specifying the filesystem master node URL.
            # Default: file://
        # --log-operators: boolean controlling whether to log individual physical operators.
            # Default: false
            # Note: may reduce performance if set to true (forces caching of
            # intermediate values).
        # --operations: select the different operations that should be benchmarked.
            # Default: all
            # Available operations: logistic-regression, tpc-h
            # Syntax: --operations logistic-regression,tpc-h
        # Leave --operations flag blank to run all benchmarks
    
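For example, using the flags documented in the help output above, a run restricted to a single benchmark on the smallest dataset might look like the following (a sketch; adjust the flag values to your setup):

```shell
# Run only the logistic regression benchmark on the sf_001 (~1 MB) dataset.
build/sbt 'test:runMain edu.berkeley.cs.rise.opaque.benchmark.Benchmark --size sf_001 --operations logistic-regression'
```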

Alternatively, you can look at Benchmark.scala, which defines the command-line interface.

  4. Submit the job to Spark:

    spark-submit --class edu.berkeley.cs.rise.opaque.benchmark.Benchmark \
        <Spark configuration parameters> \
        ${OPAQUE_HOME}/target/scala-2.12/opaque-assembly-0.1.jar \
            <flags>
    

For more help on submitting jobs to Spark, see the Submitting Applications guide. For a complete list of values possible in <Spark configuration parameters>, see the Spark properties documentation.

Our TPC-H results

We used a 3-node cluster with 4 cores and 16 GB of memory per node.

  1. Our spark-defaults.conf:

    spark.driver.memory                3g
    spark.executor.memory              11g
    spark.executor.instances           3
    
    spark.default.parallelism          36
    spark.task.maxFailures             10
    
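As a rough sanity check on these settings (a sketch based on the cluster described above; Spark does not report these numbers itself), the executor and driver memory fit within a 16 GB node, and the parallelism of 36 corresponds to three tasks per core:

```python
# Sanity-check the spark-defaults.conf values against the 3-node,
# 4-core, 16 GB-per-node cluster described above.
nodes, cores_per_node, mem_per_node_gb = 3, 4, 16

executor_mem_gb, executor_instances = 11, 3
driver_mem_gb = 3
parallelism = 36

# One 11 GB executor per node fits in 16 GB, even on the node that
# also hosts the 3 GB driver (client deploy mode).
assert executor_mem_gb + driver_mem_gb <= mem_per_node_gb

# 36 partitions over 12 total cores = 3 tasks per core.
total_cores = nodes * cores_per_node
tasks_per_core = parallelism // total_cores
print(tasks_per_core)  # → 3
```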
  2. The command we used to submit the benchmark:

    spark-submit --class edu.berkeley.cs.rise.opaque.benchmark.Benchmark \
        --master spark://&lt;master IP&gt;:7077 \
        --deploy-mode client \
        ${OPAQUE_HOME}/target/scala-2.12/opaque-assembly-0.1.jar \
            --filesystem-url hdfs://&lt;master IP&gt;:9000 \
            --size sf_1 \
            --operations tpc-h
    
  3. Final results:

TPC-H Query Results

    Query    Insecure (ms)    Encrypted (ms)    Slowdown factor
        1         4239.776       221778.540              52.31
        2         3714.381        42819.894              11.53
        3         4025.312        44301.481              11.01
        4         5280.407        51005.712               9.66
        5         3952.555        72996.460              18.47
        6         1725.161        17447.919              10.11
        7         3805.931        98998.996              26.01
        8         3935.960        76687.933              19.48
        9         6445.108       114628.632              17.79
       10         4248.211        37983.802               8.94
       11         2132.784        32503.568              15.24
       12         2491.550        24569.189               9.86
       13         1846.523        42918.316              23.24
       14         2156.878        12442.280               5.77
       15         4194.705        29534.721               7.04
       16         2715.897        21649.943               7.97
       17         3378.273       122752.706              36.34
       18         5397.520       230610.221              42.73
       19         1854.364        10147.414               5.47
       20         4439.338        25883.642               5.83
       21         7403.973       241678.371              32.64
       22         2137.743        16989.443               7.95
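The slowdown factor in the results is simply the encrypted runtime divided by the insecure runtime. A quick sketch that recomputes it for a few queries from the timings above:

```python
# Slowdown factor = encrypted runtime / insecure runtime (both in ms).
# Timings copied from the TPC-H results above.
times_ms = {
    1:  (4239.776, 221778.540),
    6:  (1725.161, 17447.919),
    19: (1854.364, 10147.414),
}
for query, (insecure, encrypted) in sorted(times_ms.items()):
    slowdown = encrypted / insecure
    print(f"Q{query}: {slowdown:.2f}x")
# → Q1: 52.31x
# → Q6: 10.11x
# → Q19: 5.47x
```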