Don't Publish Database Benchmark Results
This post is part of a series about benchmarking the performance of relational databases. An earlier post covered the pros and cons of benchmarking, the equipment and software required, and the time and cost involved.
Benchmarking database performance is often for internal use only. You cannot publish the results because commercial database licenses include the infamous "DeWitt Clause," which states that you may not publish database performance numbers without the vendor's prior written consent. This is true for all the major commercial databases, including those from IBM, Microsoft, and Oracle. Even if you downloaded a free trial version of their software, you'll find the DeWitt Clause in the End User License Agreement (the wordy thing you agreed to before downloading the software).
Open source licenses generally have no such clause and do not prevent publishing performance numbers. However, you are still bound by the rules set by the makers of the benchmarks and tools; for example, you cannot use their trademarks without written consent. Please check with all involved parties before posting performance data on the Internet.
Database licensing agreements cannot restrict the publishing of system metrics like CPU utilization, network utilization, storage bandwidth, IOPS, and latency. Server and storage vendors frequently publish such metrics in white papers and marketing materials. For example, after a TPC-H-like test you can freely share your server's bandwidth and latency numbers, but the query run times are restricted by the DBMS license.
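If you go this route, it helps to capture the system metrics separately from the query timings so the publishable data never mixes with the restricted data. Below is a minimal sketch of that idea in Python using the psutil library; the function name and sampling intervals are illustrative assumptions, not part of any benchmark kit.

```python
# Sketch: sample publishable system metrics (CPU, network, disk) during a
# benchmark run. Query run times are deliberately excluded -- those are
# the numbers a commercial DBMS license typically restricts.
import time
import psutil  # third-party: pip install psutil

def sample_system_metrics(duration_s=60, interval_s=5):
    """Collect CPU, network, and disk I/O rates at a fixed interval."""
    samples = []
    psutil.cpu_percent(interval=None)          # prime the CPU counter
    net_prev = psutil.net_io_counters()
    disk_prev = psutil.disk_io_counters()
    for _ in range(int(duration_s / interval_s)):
        time.sleep(interval_s)
        net = psutil.net_io_counters()
        disk = psutil.disk_io_counters()
        samples.append({
            "cpu_percent": psutil.cpu_percent(interval=None),
            "net_mb_per_s": (net.bytes_sent + net.bytes_recv
                             - net_prev.bytes_sent - net_prev.bytes_recv)
                            / interval_s / 1e6,
            "disk_iops": (disk.read_count + disk.write_count
                          - disk_prev.read_count - disk_prev.write_count)
                         / interval_s,
            "disk_mb_per_s": (disk.read_bytes + disk.write_bytes
                              - disk_prev.read_bytes - disk_prev.write_bytes)
                             / interval_s / 1e6,
        })
        net_prev, disk_prev = net, disk
    return samples

if __name__ == "__main__":
    # Run this alongside the benchmark; the output contains only
    # system-level numbers, not query run times.
    for s in sample_system_metrics(duration_s=30, interval_s=5):
        print(s)
```

Running the collector as a separate process, rather than instrumenting the benchmark driver itself, makes it harder to accidentally log restricted query timings alongside the shareable metrics.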
You can do an official run of a TPC benchmark and publish the results, but it is expensive and complex. An official test has hundreds of rules and requires an approved third-party auditor (there is just one approved auditor per benchmark). Many organizations hire companies that specialize in running official TPC benchmarks rather than attempting it themselves, because it really is quite complex. The other big problem is getting enough resources to conduct a benchmark: tests often involve hundreds or even thousands of servers.
If you're serious about testing and publishing results, start by reading the rules published by the Transaction Processing Performance Council (TPC). You'll probably need to hire a company to do the testing for you, and they will need to engage one of the third-party auditors pre-approved by the TPC. See http://tpc.org/information/who/whoweare5.asp for a list of approved auditors.
As noted earlier, publishing database performance numbers requires written authorization from the commercial database vendor. The vendor can require that you provide your completed test results for review before granting permission, so you have to spend all of that time and money testing with no guarantee you'll be able to publish. Allegedly, Oracle withheld permission from Dell to publish TPC test results for several years, until the results were obsolete, because (again allegedly) Dell's system had beaten Oracle Exadata and Oracle didn't want that made public until it had a chance to upgrade Exadata and beat Dell's numbers.
And finally, watch your Ps and Qs. The TPC isn't afraid to call out offenders, including big-name players and its own Council members. It publicly called out NVIDIA in January 2021 (ref) for allegedly circumventing constraints built into the TPCx-BB benchmark to inflate test results. The TPC fined Oracle $10,000 in 2009 (ref) for making claims involving TPC benchmarks without formally publishing the results. The TPC had previously called out Oracle in 1993 (ref) for adding a special option to its database specifically intended to inflate test results.
With all of these things in mind, let's just do benchmarking for our own internal purposes and forget about publishing. If you do need to publish something on the Internet, please limit it to system metrics like CPU, network, and storage utilization.