Data Lake Management

Keep your data lake clean, and find, analyze, curate, shape, and enrich the right data to enable AI and analytics.

Do you know what is in your lake?

Data lakes let you bring in a far greater variety and volume of data, including data from new sources that never existed in the enterprise data warehouse. That, however, makes having a catalog, definitions, and metadata about the data in the lake even more important. Leverage DvSum's model mapping and business glossary to catalog all your data and establish a common definition of data for the various analytics needs served from the lake.

Filter data before it enters the lake

The DvSum data preparation platform lets you define powerful business and technical filter criteria. These filters help ensure that the data entering your lake is as clean as possible.
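The idea of combining business and technical filter criteria before data lands in the lake can be sketched generically. The snippet below is an illustration only, not DvSum's actual API; the record fields and rule names are hypothetical.

```python
# Generic sketch (not DvSum's API): apply technical and business filter
# rules to records before they are written to the lake.

records = [
    {"order_id": "A100", "amount": 250.0, "region": "EMEA"},
    {"order_id": "", "amount": -5.0, "region": "EMEA"},       # fails technical rules
    {"order_id": "A101", "amount": 900.0, "region": "TEST"},  # fails business rule
]

# Technical rules: structural validity of each record.
technical_rules = [
    lambda r: bool(r.get("order_id")),  # key must be present and non-empty
    lambda r: r.get("amount", 0) >= 0,  # no negative amounts
]

# Business rules: domain-specific relevance criteria.
business_rules = [
    lambda r: r.get("region") != "TEST",  # exclude test regions
]

def passes(record, rules):
    """Return True only if the record satisfies every rule."""
    return all(rule(record) for rule in rules)

clean = [r for r in records if passes(r, technical_rules + business_rules)]
print(clean)  # only the first record survives
```

Keeping technical checks (types, required keys, value ranges) separate from business checks (domain relevance) makes each rule set easier to maintain and reuse across sources.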


Keep the data lake clean

Just like enterprise data, data lakes become dirty over time. The difference is that the volume of data that may be dirty, old, or irrelevant is significantly higher, which can add noise to your predictive analytics efforts. The DvSum rules engine, deployed as a production workflow, acts like a self-driving submarine that continually identifies and scrubs bad data, keeping the lake clean and relevant.
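A scheduled scrub pass of this kind can be illustrated with a minimal sketch. This is not DvSum's rules engine; the retention window, record fields, and rules below are assumptions for illustration.

```python
# Minimal sketch (not DvSum's rules engine): a scrub pass that splits
# lake records into kept and removed sets based on staleness rules.
from datetime import datetime, timedelta

NOW = datetime(2024, 6, 1)
RETENTION = timedelta(days=365)  # hypothetical retention window

lake = [
    {"id": 1, "updated": datetime(2024, 5, 20), "source": "erp"},
    {"id": 2, "updated": datetime(2022, 1, 15), "source": "erp"},     # stale
    {"id": 3, "updated": datetime(2024, 4, 2), "source": "retired"},  # irrelevant
]

# A record matching ANY scrub rule is flagged for removal.
scrub_rules = [
    lambda r: NOW - r["updated"] > RETENTION,  # older than retention window
    lambda r: r["source"] == "retired",        # from a decommissioned source
]

def scrub(records):
    """Split records into (kept, removed) based on the scrub rules."""
    removed = [r for r in records if any(rule(r) for rule in scrub_rules)]
    kept = [r for r in records if r not in removed]
    return kept, removed

kept, removed = scrub(lake)
print([r["id"] for r in kept])     # [1]
print([r["id"] for r in removed])  # [2, 3]
```

In a production workflow, a pass like this would run on a schedule against new partitions, quarantining flagged records rather than deleting them outright.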

Sandbox it in the DvSum Cluster. Deploy it in your own.

Use the DvSum connectors for Big Data to connect to and automatically catalog your lake inventory, or drop files directly into DvSum's AWS S3 storage to define transformation steps and export your data for downstream needs. Re-use the transform script to run against new data anytime, or deploy the transform as a production application that runs within your own data lake.


Schedule a 15-minute discovery call with DvSum

We'd love to discuss how we can help you get more value out of your data and show you our platform and technology.