Portfolio Spotlight: Bluware Volume Data Store (VDS)
As part of our Bluware portfolio spotlight, Andy James, Chief Product Officer at Bluware, talks us through the Bluware Volume Data Store (VDS), a powerful storage format that can support previously unthinkable application workflows and end-user experiences.
INTRODUCTION
Bluware Volume Data Store (VDS) is a powerful and flexible storage format for signal data. It was conceived more than 15 years ago to overcome the inherent limitations in existing formats, which were designed for tape storage. The architecture adopts concepts developed in the gaming industry where interactive performance is critical. VDS can support previously unachievable application workflows and end-user experiences.
While VDS is generally known for storing seismic data, it is a generic storage system built to store any type of signal data. For seismic data, VDS can store 2D, 3D, stacked, and pre-stack (migrated and raw gathers) data.
This document describes the capabilities of the storage format and details the differences between the two available implementations: the open-source OpenVDS library and the commercial Bluware VDS implementation.
VDS CAPABILITIES
VDS combines many capabilities to provide flexibility and performance. It is important to understand these capabilities and the value that each one delivers. This also helps define the differences between Bluware VDS, which is part of Bluware’s commercial framework, and OpenVDS, which is available as open source through the Open Subsurface Data Universe (OSDU). For simplicity, the capabilities are grouped into three lifecycle stages: writing, storing, and reading.
WRITING VDS
When writing data into VDS format, there are several options that can be controlled by the user. These options allow the user to make trade-offs between storage space and performance based on the expected use case. For example, will the dataset be used for interactive visualization, or is it intended for archival storage?
Compression Mode
When importing to VDS format, different compression modes and qualities can be applied to the signal data. These may provide significant savings in storage space.
Adaptive Compression (Bluware VDS Only)
Adaptive compression is based on a bi-orthogonal wavelet transform, with the compression itself coming from encoding the wavelet coefficients adaptively. The result is equivalent to a wavelet soft-threshold noise removal filter. In practice, selecting a higher compression ratio makes the data smoother and less noisy. This can be used strategically to improve workflows such as horizon interpretation, which performs better on smoother data.
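To make the soft-threshold idea concrete, here is a minimal sketch in Python. It is not Bluware's codec: it swaps in a single-level Haar transform for the bi-orthogonal wavelet, and the function names and threshold value are assumptions chosen for illustration. It shows only how shrinking small detail coefficients toward zero smooths a trace.

```python
import math

SCALE = 1 / math.sqrt(2)  # orthonormal Haar scaling factor

def haar_forward(signal):
    """One-level Haar transform of an even-length sequence (assumed stand-in wavelet)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * SCALE for a, b in pairs]
    detail = [(a - b) * SCALE for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SCALE, (a - d) * SCALE])
    return out

def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; magnitudes below t become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(signal, t):
    """Transform, shrink the detail coefficients, and transform back."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))

trace = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.9, 5.1]
smooth = denoise(trace, t=0.5)  # small sample-to-sample wiggles are removed
```

With a threshold of zero the trace is reconstructed unchanged. A larger threshold zeroes more coefficients, which leaves fewer values to encode and a smoother result.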
There are several quality presets available. Some notable presets are:
Near lossless compression produces data approximately 90 percent smaller than the original, yet the result can be used for most interactive workflows and visualization.
Lossless compression is a unique capability. It applies the highest quality preset, virtual lossless, to the data and additionally stores the difference between the original input data and the virtual lossless result. This enables the VDS to be converted back to a SEG-Y file that is binary (bitwise) equal to the original input SEG-Y. This capability is excellent for data preservation: it provides a reduction in space of approximately 25 percent with the confidence that the original SEG-Y is preserved exactly.
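The principle of storing a lossy approximation plus the difference from the original can be sketched as follows. This is an illustration only: the quantizer standing in for the lossy codec, the XOR of 32-bit float bit patterns used as the stored difference, and all function names are assumptions, not the scheme VDS actually uses.

```python
import struct

def float_bits(x):
    """Bit pattern of x stored as a 32-bit IEEE float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_float(b):
    """Inverse of float_bits."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def lossy_approx(x, step=1 / 64):
    """Stand-in for a lossy codec: snap the sample to a coarse grid."""
    return round(x / step) * step

def encode(samples):
    """Return the lossy approximation and the bitwise difference from the input."""
    approx = [lossy_approx(s) for s in samples]
    residual = [float_bits(s) ^ float_bits(a) for s, a in zip(samples, approx)]
    return approx, residual

def decode(approx, residual):
    """Recombine approximation and residual into the exact original bits."""
    return [bits_float(float_bits(a) ^ r) for a, r in zip(approx, residual)]

# Samples as they would sit in a file of 32-bit floats.
original = [bits_float(float_bits(x)) for x in (0.1, -2.5, 3.14159, 0.001)]
approx, residual = encode(original)
restored = decode(approx, residual)
```

A reader that only needs visualization quality can ignore the residual, while a reader that needs the exact input applies it and recovers every bit.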
Uncompressed (OpenVDS and Bluware VDS)
A VDS can be created without compression, storing the signal data in raw format. Even in uncompressed format, there is support for constant regions and regions without any data, which can lead to some storage reductions.
Bricked Format (OpenVDS and Bluware VDS)
Signal data in VDS format is organized into bricks, which by default are cubes of 128×128×128 sample points with a 4-voxel overlap to avoid blocking artifacts. Two-dimensional seismic data is organized into slices of 480×480 sample points. Optionally, data imported into VDS format can be stored in both 3D bricks and 2D slices. This duplication allows for extremely fast slice-based access.
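As a rough illustration of what bricking means for a reader, the sketch below computes how many bricks cover a volume and which bricks a requested region touches. The 128-sample brick size comes from the text; the function names are hypothetical, and the 4-voxel overlap is deliberately ignored to keep the arithmetic simple.

```python
import math

BRICK = 128  # default brick edge length in samples, per the text

def brick_grid(shape, brick=BRICK):
    """Number of bricks along each axis needed to cover a volume of this shape."""
    return tuple(math.ceil(n / brick) for n in shape)

def bricks_for_region(lo, hi, brick=BRICK):
    """Indices of bricks overlapped by the half-open sample box [lo, hi)."""
    ranges = [range(l // brick, (h - 1) // brick + 1) for l, h in zip(lo, hi)]
    return [(i, j, k) for i in ranges[0] for j in ranges[1] for k in ranges[2]]

# A volume of 1000 inlines x 800 crosslines x 1500 samples.
shape = (1000, 800, 1500)
grid = brick_grid(shape)
# One full inline section at inline index 300.
inline_bricks = bricks_for_region((300, 0, 0), (301, 800, 1500))
```

In this example the volume is covered by 8 × 7 × 12 = 672 bricks, and a single inline section touches 84 of them, so only that subset needs to be read.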
Multiple Level of Detail (OpenVDS and Bluware VDS)
When importing data, it is possible to generate decimated versions of the data at lower resolution. These decimated datasets are much smaller than the input data and are useful during visualization, since less data must be read in order to visualize a zoomed-out view of the entire dataset.
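The saving from reading a decimated level can be estimated with a few lines of arithmetic. The sketch below assumes each level of detail halves the resolution along every axis; the text does not state the decimation factor, so treat that and the function names as assumptions.

```python
import math

def lod_shape(shape, level):
    """Shape at a given level, assuming every level halves each axis."""
    return tuple(math.ceil(n / 2 ** level) for n in shape)

def lod_fraction(shape, level):
    """Fraction of the full-resolution sample count held at this level."""
    return math.prod(lod_shape(shape, level)) / math.prod(shape)

full = (1024, 1024, 2048)
for level in range(4):
    print(level, lod_shape(full, level), lod_fraction(full, level))
```

Under this assumption each level holds one eighth of the samples of the level above it, which is why a zoomed-out view can be drawn from a small fraction of the data.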
Multiple Channels (OpenVDS and Bluware VDS)
The VDS format supports sample data with up to six dimensions and an unlimited number of auxiliary channels. For example, 4D seismic data would be stored with dimensions including inline, crossline, time-slice, and time, with a primary channel holding the amplitudes and auxiliary channels holding data such as the original trace headers. Each channel can have its own compression settings, so the original trace headers can be stored with zip compression while adaptive compression is applied to the signal data.
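One way to picture the channel model is as a small description object. The sketch below is purely illustrative: the class names, field names, and compression labels are hypothetical and do not correspond to the OpenVDS or Bluware VDS APIs. Only the six-dimension limit and the example axes and channels are taken from the text.

```python
from dataclasses import dataclass, field

MAX_DIMENSIONS = 6  # limit stated in the text

@dataclass
class Channel:
    name: str
    compression: str  # illustrative labels: "adaptive", "zip", "none"

@dataclass
class VolumeDescription:
    axes: tuple
    channels: list = field(default_factory=list)

    def __post_init__(self):
        # Reject layouts outside the one-to-six dimension range.
        if not 1 <= len(self.axes) <= MAX_DIMENSIONS:
            raise ValueError(f"expected 1 to {MAX_DIMENSIONS} axes, got {len(self.axes)}")

volume = VolumeDescription(
    axes=("inline", "crossline", "time-slice", "time"),
    channels=[
        Channel("amplitude", compression="adaptive"),   # primary channel
        Channel("trace_headers", compression="zip"),    # auxiliary channel
    ],
)
```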
STORING VDS
VDS can be stored in two forms: as a file, or in a cloud-based object store such as AWS S3 or Microsoft Azure Blob Storage.
While the individual bricks inside a VDS file are identical to the bricks stored in a cloud-based object store, they are organized differently. Moving between file-based and object-store-based storage is a simple conversion process.
File-Based VDS (Bluware VDS Only)
A file-based VDS exists on disk as a single file with the extension ‘.vds’. This form is used for non-cloud deployments and for data exchange in VDS format. These files are accessed and managed as regular files on disk.
Cloud-Based Object-Store VDS (OpenVDS and Bluware VDS)
Object storage architecture was designed to allow massive amounts of data to be stored without the complexities involved in managing traditional networked file systems. Object storage delivers secure, high-performance access to data over the internet. Data can also be replicated across multiple geographic regions, allowing for fast access from different continents.
When storing VDS on object stores (such as AWS S3 or Azure Blob Storage), each brick is stored as a separate object with a common prefix.
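Because every brick is its own object, a client can work out which keys it needs and request them directly. The sketch below illustrates that pattern with a dictionary standing in for a bucket or container. The key layout shown is hypothetical; the text only says that the objects share a common prefix and does not describe the actual naming scheme.

```python
def brick_key(prefix, lod, index):
    """Hypothetical object key for one brick under a shared dataset prefix."""
    i, j, k = index
    return f"{prefix}/lod{lod}/{i}_{j}_{k}"

def fetch_bricks(store, prefix, lod, indices):
    """Fetch bricks by key; `store` is any mapping from key to bytes."""
    return {idx: store[brick_key(prefix, lod, idx)] for idx in indices}

# A dict stands in for an object store in this sketch.
store = {brick_key("survey-a", 0, (i, 0, 0)): bytes([i]) * 16 for i in range(4)}

wanted = [(1, 0, 0), (3, 0, 0)]
bricks = fetch_bricks(store, "survey-a", 0, wanted)
```

In a real deployment the mapping lookup would be a GET request issued by the client library, with no application server between the client and the storage service.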
Serverless Architecture (OpenVDS and Bluware VDS)
One of the key benefits of VDS on an object store is that data is served without the need for a dedicated server. Using our APIs, data on object stores is accessed just like a regular file-based VDS. This is a significant cost benefit, particularly in the cloud, where compute capacity is billed.
READING VDS
The VDS format provides many powerful capabilities which are used to read signal-based data.
Read Compressed VDS (OpenVDS and Bluware VDS)
Any VDS, whether created using commercial Bluware VDS with adaptive compression or using OpenVDS, can be read by either version. While writing data using adaptive compression is exclusive to Bluware VDS, both versions support reading data stored with adaptive compression. This is important for data exchange between companies using Bluware commercial versions and those using the non-commercial OpenVDS version. Data in VDS format, whether compressed or not, is fully transferable with no license requirement.
Adaptive Streaming (OpenVDS and Bluware VDS)
VDS is stored in such a way that any signal quality lower than the quality used to create the VDS can be read via an API call. For example, data can be stored using lossless compression but read in near lossless quality. In practice, displaying a time slice on a mobile device requires very little data to match what the eye can resolve, whereas a calculation may require much higher quality. Both applications can be served from the same VDS through adaptive streaming, without additional copies of the data.
Adaptive streaming increases performance of applications as less data is sent through networks to the client.
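The idea of serving several qualities from one stored copy can be illustrated with a layered encoding, where each layer refines what the earlier layers left out and a reader stops after as many layers as it needs. This is a generic sketch of progressive refinement, not the mechanism VDS uses; the step sizes and function names are assumptions.

```python
STEPS = (1.0, 0.1, 0.01)  # assumed quantization step per refinement layer

def encode_layers(values, steps=STEPS):
    """Split values into layers; each layer quantizes the remaining error."""
    layers = []
    remaining = list(values)
    for step in steps:
        layer = [round(r / step) for r in remaining]
        remaining = [r - q * step for r, q in zip(remaining, layer)]
        layers.append(layer)
    return layers

def decode_layers(layers, quality, steps=STEPS):
    """Reconstruct using only the first `quality` layers."""
    out = [0.0] * len(layers[0])
    for layer, step in zip(layers[:quality], steps[:quality]):
        out = [o + q * step for o, q in zip(out, layer)]
    return out

def max_error(values, decoded):
    """Largest absolute difference between the input and a reconstruction."""
    return max(abs(v - d) for v, d in zip(values, decoded))

values = [3.14159, -2.71828, 0.57721]
layers = encode_layers(values)
preview = decode_layers(layers, quality=1)   # coarse, least data transferred
precise = decode_layers(layers, quality=3)   # all layers
```

A viewer on a slow link would request only the first layer, while a computation would request all of them, and both read from the same stored layers.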
Random Access Reading (OpenVDS and Bluware VDS)
Unlike the SEG-Y format, which stores data sequentially, VDS is a brick-based format. This allows for data to be read in any direction without performance degradation. Whether reading an inline, crossline, time-slice, arbitrary line, or a cube, VDS delivers exceptional performance.
GPU Enabled (Bluware VDS Only)
When data is read from a compressed VDS, it is decompressed on the client computer. If the computer has a CUDA-capable GPU, the GPU is used for decompression and the data can be stored directly in GPU memory. Bluware leverages this architecture in applications such as Bluware Interactive Deep Learning, which uses GPUs for training and inference.
The use of GPUs for decompression turns adaptive compression into an acceleration technology, as it is faster to read and decompress the compressed data than it would be to read the uncompressed data directly from disk or over a network.
Multi-Threading (OpenVDS and Bluware VDS)
VDS workflows, such as decompression on the client system, have been optimized to make use of multiple CPU cores if available. The result is increased performance and an interactive experience.
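Since bricks are independent of one another, decompressing them is a natural fit for a worker pool. The sketch below shows the pattern with Python's standard library, using zlib as a stand-in codec; it is not how either VDS implementation is written, and the function name is an assumption.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def decompress_bricks(compressed, workers=4):
    """Decompress independent bricks in parallel; output order matches input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, compressed))

# Eight synthetic bricks, compressed with zlib as a stand-in for a real codec.
raw_bricks = [bytes([i]) * 4096 for i in range(8)]
compressed = [zlib.compress(b) for b in raw_bricks]
restored = decompress_bricks(compressed)
```

Threads are a reasonable choice here because CPython's zlib module releases the global interpreter lock while it works on the data, so several bricks can be decompressed at the same time.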
Compute (Bluware VDS Only)
Bluware has developed a powerful framework to perform complex computations on the fly. Compute plug-ins are developed in C++ and CUDA (for GPU processing). A compute step reads a VDS and manipulates the data before passing the result, as a VDS, into another compute step or a visualization workflow. An example of a compute plug-in is a spectral decomposition process. Compute plug-ins can be chained together to achieve complex workflows on the fly. The compute framework can be used to build both applications and services.
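The chaining idea can be sketched independently of the actual framework. The example below uses Python rather than C++ or CUDA, and the step functions and the `chain` helper are hypothetical stand-ins, not part of any Bluware API. It shows only how the output of one step becomes the input of the next.

```python
from functools import reduce

def chain(*steps):
    """Compose steps left to right, so chain(f, g)(x) equals g(f(x))."""
    return lambda brick: reduce(lambda data, step: step(data), steps, brick)

def remove_mean(brick):
    """Subtract the average value from every sample."""
    mean = sum(brick) / len(brick)
    return [x - mean for x in brick]

def absolute(brick):
    """Take the magnitude of every sample."""
    return [abs(x) for x in brick]

def scale(factor):
    """Build a step that multiplies every sample by a constant."""
    return lambda brick: [x * factor for x in brick]

pipeline = chain(remove_mean, absolute, scale(2.0))
result = pipeline([1.0, 2.0, 3.0])
```

Because every step takes a brick and returns a brick, steps can be reordered or inserted without changing the code that drives the pipeline.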
API Access (OpenVDS and Bluware VDS)
APIs to import and read VDS are available in Python and C++. In addition, Bluware VDS supports .NET and Java.
OPENVDS
Bluware’s technologies have always been fully open and extendable. The philosophy of OSDU resonates very well in that regard, and our decision to open-source and contribute an open-source VDS implementation to OSDU was easy: It is a unique opportunity to get the whole industry behind the most efficient and cost-effective way to store and use seismic data.
OpenVDS is a set of tools and an open-source API to read and write data in VDS format. The read and write capabilities of the OpenVDS implementation are not as complete as those in the commercial version; however, the two are interchangeable: there is no conversion process, and the data is the same and can be read by either version. When compared to other formats for seismic data, OpenVDS offers powerful capabilities and a significant step forward for the industry.