
VAST DataEngine

Breathing Life into Your Data

The VAST DataEngine is a global function execution engine that consolidates data centers and cloud regions into one global computational framework.

The VAST DataEngine brings data to life in a thinking machine that can continuously process and learn on data from the natural world.

 

No more batch. No more silos of data processing. Just continuous, recursive computing.

 

Shipping in 2024, the VAST DataEngine will redefine the data computing paradigm by introducing serverless functions and real-time triggers into the VAST Data Platform. Once logic and state are merged, files, objects and tables come to life from edge to cloud.

 

The VAST Data Platform breaks the tradeoff between data streaming and global insight by engineering data processing and event notifications natively into the system.

Simplifying AI Pipeline Management - Adding Functions and Triggers to Data

The VAST DataEngine provides the execution and orchestration intelligence to manage and run function pipelines. These pipelines let data scientists and deep learning practitioners scrape, transform, train, infer, and otherwise derive value from the files, objects and tables the VAST Data Platform holds, without worrying about where, how, or even when those functions are executed.

 

The VAST DataEngine automatically optimizes these pipelines to minimize cost, execution time, and/or system utilization, delivering a serverless execution environment across multiple on-premises and cloud locations.
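One way to picture this kind of optimization is a weighted score over candidate execution sites. The sketch below is purely illustrative: the `Site` fields, the weights and the `pick_site` function are hypothetical names invented for this example, and nothing here describes how the DataEngine actually schedules work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A candidate place to run a function (hypothetical model)."""
    name: str
    cost_per_hour: float  # currency units per hour of compute
    est_hours: float      # estimated run time of the function at this site
    utilization: float    # current load, from 0.0 (idle) to 1.0 (full)


def pick_site(sites, w_cost=1.0, w_time=1.0, w_util=1.0):
    """Return the site with the lowest weighted score."""
    def score(site):
        return (w_cost * site.cost_per_hour * site.est_hours
                + w_time * site.est_hours
                + w_util * site.utilization)
    return min(sites, key=score)


sites = [
    Site("on_prem_gpu", cost_per_hour=2.0, est_hours=1.0, utilization=0.9),
    Site("cloud_gpu", cost_per_hour=6.0, est_hours=0.5, utilization=0.1),
]

# Weighting only cost or only time selects different sites.
cheapest = pick_site(sites, w_cost=1.0, w_time=0.0, w_util=0.0)
fastest = pick_site(sites, w_cost=0.0, w_time=1.0, w_util=0.0)
```

Changing the weights shifts the choice between the cheaper on-premises site and the faster cloud site, which is the tradeoff the paragraph above describes.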


With the VAST DataEngine, data and changes to data trigger action, that action is performed on the data, and the system processes recursively, forever. This turns all of your data centers, and the public cloud resources you give it access to, into an integrated thinking machine that takes data in and delivers valuable insights.
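The trigger-and-act loop described above can be sketched in a few lines of plain Python. This is a minimal in-memory illustration under stated assumptions: `TinyStore`, `on_write`, `transcribe` and `summarize` are hypothetical names made up for this example and are not part of the VAST SDK.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple


class TinyStore:
    """Hypothetical in-memory store where writes fire registered functions."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.triggers: List[Tuple[str, Callable[["TinyStore", str], None]]] = []
        self._pending: deque = deque()

    def on_write(self, prefix: str, func) -> None:
        """Register func to run whenever a key starting with prefix is written."""
        self.triggers.append((prefix, func))

    def write(self, key: str, value: str) -> None:
        """Store a value and queue an event for it."""
        self.data[key] = value
        self._pending.append(key)

    def run(self) -> int:
        """Drain the event queue; functions may write and so queue more events."""
        handled = 0
        while self._pending:
            key = self._pending.popleft()
            for prefix, func in self.triggers:
                if key.startswith(prefix):
                    func(self, key)
                    handled += 1
        return handled


def transcribe(store: TinyStore, key: str) -> None:
    # Stand-in transformation: upper-case the raw value.
    store.write(key.replace("raw/", "text/"), store.data[key].upper())


def summarize(store: TinyStore, key: str) -> None:
    # Stand-in summary: keep the first five characters.
    store.write(key.replace("text/", "summary/"), store.data[key][:5])


store = TinyStore()
store.on_write("raw/", transcribe)
store.on_write("text/", summarize)
store.write("raw/clip1", "hello world")
handled = store.run()
```

Writing one raw item causes `transcribe` to run, whose output in turn causes `summarize` to run. Each function's write is itself a data change, which is what makes the processing recursive.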

 

The VAST DataEngine is the basis for perpetual AI training and inference for the AI-powered discoveries of the future.


One Global Computational Framework

Global Compute Execution and Orchestration Across On-Prem, Cloud and the Edge

The VAST DataEngine is built on a container framework that allows for services to be globally executed across the VAST DataSpace, consolidating data centers and cloud regions into one global computational framework.


Built on a High-Performance Global Namespace

Runs On-Prem and in the Cloud

Your HW, Your Data


A Programmable Computing Engine in Software

The DataEngine is a containerized computing environment that customers deploy on their choice of CPUs, GPUs and DPUs – from edge to cloud.

 

By embedding logic directly into the VAST Data Platform, the system can schedule processing events in real time, triggered by data activities.


The VAST DataEngine Explained

The compute engine of the VAST Data Platform, the VAST DataEngine brings insights to life by adding functions and triggers to data, simplifying AI pipeline management.

A Programmable DataEngine Environment via a Simple Python SDK

The VAST DataEngine is a serverless platform, programmed in Python, that integrates stateful functions into an exabyte-scale datastore and provides a programmable environment for developers.

 

By integrating streaming and data processing with an exabyte-scale datastore and database, the Data Platform enables comprehensive function calling with minimal code.


Next-Generation Event Streaming Infrastructure

The VAST DataEngine features a new data streaming interface designed to write events natively into the VAST DataBase.


For the first time, it is possible to analyze all data by ingesting streaming data in real time into VAST’s exabyte-scale transactional and analytical database.


A Real-Time Event Router

The VAST Event Router unifies unstructured and structured data event management into a common platform, providing event consumers simple tools to trigger action.
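As a rough illustration of routing unstructured and structured events through one common interface, the sketch below uses a single event shape for both object and table changes. `Event`, `EventRouter`, and the source and action labels are hypothetical and chosen only for this example; they are not the VAST Event Router API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List, Tuple


@dataclass(frozen=True)
class Event:
    """One event shape for both kinds of data (hypothetical)."""
    source: str  # "object" or "table"
    action: str  # e.g. "created" or "row_inserted"
    path: str    # object key or table name


class EventRouter:
    """Routes events to handlers subscribed to a (source, action) pair."""

    def __init__(self) -> None:
        self._subs: DefaultDict[Tuple[str, str], List[Callable[[Event], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, source: str, action: str, handler) -> None:
        self._subs[(source, action)].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event and return how many handlers received it."""
        handlers = self._subs.get((event.source, event.action), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


seen = []
router = EventRouter()
router.subscribe("object", "created", lambda e: seen.append(("index", e.path)))
router.subscribe("table", "row_inserted", lambda e: seen.append(("score", e.path)))

delivered = [
    router.publish(Event("object", "created", "images/cat.jpg")),
    router.publish(Event("table", "row_inserted", "sensor_readings")),
    router.publish(Event("object", "deleted", "images/old.jpg")),  # no subscriber
]
```

Consumers subscribe the same way whether the event came from a file or a table, which is the point of a common event platform.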


The VAST Data Platform is designed to create structure and insight from unstructured data

By storing triggers and functions as state in the VAST Data Platform, your code becomes dynamically managed by a global data store that supports global code versioning, global code distribution and global code security policies.


PDF - The VAST Data Platform White Paper

This comprehensive white paper describes the VAST Data Platform in detail.

 

Pages 82 through 88 describe the VAST DataEngine in depth.

 

The VAST Data Platform is a breakthrough approach to data-intensive computing that serves as the comprehensive software infrastructure required to capture, catalog, refine, enrich, and preserve data through real-time deep data analysis and deep learning.

 

It is designed to provide seamless and universal data access and computing from edge-to-cloud, all from a platform that is designed for enterprises and cloud service providers to deploy on the infrastructure of their choosing.


A New AI Dataset

Introducing the VAST DataSet

Deep learning data engineering is tough. Data engineers write large dataset files down to archive storage for training, which creates a number of problems associated with rigid data management:

  • If model training requires data variation, new datasets are written down to storage, often creating redundant data because datasets use overlapping training examples.

  • Because conventional datasets are not embedded with training code, it can be difficult to reproduce trained models as data and code continue to evolve independently.

With the DataEngine, VAST is introducing a new concept called the VAST DataSet. This new approach to data management leverages the VAST DataBase to create materialized views of example data without copying and re-copying data into blunt data containers.

DataSets can scale to exabytes. Each DataSet includes an indexed set of examples and the code used for training so that it’s easy to reproduce models on the fly.
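The idea of a dataset as an index over shared examples plus a reference to its training code can be sketched as follows. This is an assumption-laden toy, not the VAST DataSet format: `EXAMPLES`, `DataSetView` and `make_view` are hypothetical names, and a SHA-256 digest stands in for however code is actually tracked.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Shared pool of examples, stored once (hypothetical stand-in for a table).
EXAMPLES: Dict[int, str] = {1: "cat.jpg", 2: "dog.jpg", 3: "bird.jpg", 4: "fish.jpg"}


@dataclass(frozen=True)
class DataSetView:
    """A dataset as a list of example ids plus a digest of its training code."""
    name: str
    example_ids: Tuple[int, ...]
    code_sha256: str

    def examples(self) -> List[str]:
        # Resolve ids against the shared pool; no example data is copied.
        return [EXAMPLES[i] for i in self.example_ids]


def make_view(name: str, example_ids, training_code: str) -> DataSetView:
    digest = hashlib.sha256(training_code.encode("utf-8")).hexdigest()
    return DataSetView(name, tuple(example_ids), digest)


TRAIN_V1 = "def train(batch): ..."  # placeholder source text, not real training code
mammals = make_view("mammals", [1, 2], TRAIN_V1)
pets = make_view("pets", [1, 2, 4], TRAIN_V1)
```

Both views reference examples 1 and 2 without duplicating them, and the stored digest ties each view to the exact code text it was built with.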
