Data-Related Concepts

Data Lake

A data lake is a centralized repository that lets you store all your structured and unstructured data at any scale. It is useful for organizations in regulated industries such as healthcare or finance, which need to store huge datasets of private or sensitive user data. The main challenge with a data lake architecture is that raw data is stored with no oversight of its contents. According to Ventana Research, data lake solutions can capture the growing volume, velocity, and variety of data to support the rapid growth of technologies such as artificial intelligence (AI), machine learning (ML), and the Internet of Things (IoT). According to an IBM report, 62% of data lake users deploy their data lakes on Hadoop.
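
One common way to reduce the oversight problem is to record basic metadata whenever data lands in the lake. The following is a minimal sketch of that idea, not a production design. It assumes a local directory stands in for the lake's storage layer, and the names ingest, list_manifest, and manifest.jsonl are hypothetical, chosen only for illustration. Files are stored byte-for-byte as received, with no schema enforced, while an append-only manifest notes what was stored and where.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def ingest(lake_root: Path, source_name: str, filename: str, payload: bytes) -> dict:
    """Store payload unchanged under raw/<source_name>/ and log a manifest entry.

    Hypothetical helper: a local directory stands in for the lake's storage.
    """
    target_dir = lake_root / "raw" / source_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)  # raw bytes, written as-is; no schema is applied

    entry = {
        "path": target.relative_to(lake_root).as_posix(),
        "source": source_name,
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only manifest: one JSON object per line.
    with (lake_root / "manifest.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def list_manifest(lake_root: Path) -> list:
    """Return every manifest entry, oldest first (empty list if none yet)."""
    manifest = lake_root / "manifest.jsonl"
    if not manifest.exists():
        return []
    with manifest.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Calling ingest(root, "crm", "customers.csv", data) for a structured export and ingest(root, "sensors", "reading.bin", data) for a binary blob stores both in the same repository, and list_manifest(root) then gives at least a basic inventory of what the lake contains.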