Science, asked by poojamallick329, 2 months ago

What is an advantage of storing data in a Data Lake, without
applying a specific schema to it initially?
It allows more flexibility to use the data in various innovative ways.
It saves both developer time and company money by never having to design a schema.
It avoids corrupting the data by working with it before there is a clearly defined ne
It makes working with the data faster as data lakes are more efficient.

Answers

Answered by vivekbt42kvboy

Explanation:

A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, without having to structure it first, and run different types of analytics, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better decisions.

[Figure: data lake diagram]

Why do you need a data lake?

Organizations that successfully generate business value from their data will outperform their peers. An Aberdeen survey found that organizations that implemented a data lake outperformed similar companies by 9% in organic revenue growth. These leaders were able to do new types of analytics, like machine learning, over new sources such as log files, click-stream data, social media, and internet-connected devices stored in the data lake. This helped them identify and act upon opportunities for business growth faster by attracting and retaining customers, boosting productivity, proactively maintaining devices, and making informed decisions.

Data Lakes compared to Data Warehouses – two different approaches

Depending on the requirements, a typical organization will require both a data warehouse and a data lake, as they serve different needs and use cases.

A data warehouse is a database optimized to analyze relational data coming from transactional systems and line of business applications. The data structure and schema are defined in advance to optimize for fast SQL queries, where the results are typically used for operational reporting and analysis. Data is cleaned, enriched, and transformed so it can act as the “single source of truth” that users can trust.

A data lake is different because it stores both relational data from line of business applications and non-relational data from mobile apps, IoT devices, and social media. The structure of the data, or schema, is not defined when the data is captured. This means you can store all of your data without careful design and without needing to know what questions you might want answered in the future. Different types of analytics, like SQL queries, big data analytics, full text search, real-time analytics, and machine learning, can then be used on the data to uncover insights.
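To make the difference concrete, here is a minimal Python sketch of the two approaches. It is only an illustration under stated assumptions: the sample events, the field names (`user_id`, `action`, `ts`, `device`, `temp_c`), and the helper `read_with_schema` are all hypothetical, an in-memory SQLite table stands in for the warehouse, and a plain list of raw JSON lines stands in for the lake.

```python
import json
import sqlite3

# Hypothetical raw events, as they might arrive from different sources.
raw_events = [
    '{"user_id": "a1", "action": "click", "ts": 1700000000}',
    '{"device": "sensor-7", "temp_c": 21.5}',
    '{"user_id": "b2", "action": "purchase", "amount": 19.99, "ts": 1700000050}',
]

# Schema-on-write (warehouse style): the table is designed first,
# and records that do not fit the schema are rejected at load time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "user_id TEXT NOT NULL, action TEXT NOT NULL, ts INTEGER NOT NULL)"
)
loaded, rejected = 0, 0
for line in raw_events:
    rec = json.loads(line)
    try:
        conn.execute(
            "INSERT INTO events VALUES (?, ?, ?)",
            (rec.get("user_id"), rec.get("action"), rec.get("ts")),
        )
        loaded += 1
    except sqlite3.IntegrityError:
        # A missing field becomes NULL, which violates NOT NULL.
        rejected += 1

# Schema-on-read (lake style): every record is stored as-is.
lake = list(raw_events)


def read_with_schema(lines, fields):
    """Yield only the records that contain all requested fields."""
    for line in lines:
        rec = json.loads(line)
        if all(f in rec for f in fields):
            yield {f: rec[f] for f in fields}


# The schema is chosen at analysis time, per question being asked.
user_actions = list(read_with_schema(lake, ["user_id", "action", "ts"]))
temperatures = list(read_with_schema(lake, ["device", "temp_c"]))

print(loaded, rejected, len(lake))
print(user_actions)
print(temperatures)
```

In this sketch the warehouse-style load drops the sensor reading because it does not match the table, while the lake keeps all three records and lets each analysis pick the fields it needs later.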

As organizations with data warehouses see the benefits of data lakes, they are evolving their warehouses to include data lakes and to enable diverse query capabilities, data science use cases, and advanced capabilities for discovering new information models. Gartner names this evolution the “Data Management Solution for Analytics” or “DMSA.”

Characteristics | Data Warehouse | Data Lake
Data | Relational from transactional systems, operational databases, and line of business applications | Non-relational and relational from IoT devices, web sites, mobile apps, social media, and corporate applications
Schema | Designed prior to the DW implementation (schema-on-write) | Written at the time of analysis (schema-on-read)
Price/Performance | Fastest query results using higher cost storage | Query results getting faster using low-cost storage
Data Quality | Highly curated data that serves as the central version of the truth | Any data that may or may not be curated (i.e. raw data)
Users | Business analysts | Data scientists, data developers, and business analysts (using curated data)
Analytics | Batch reporting, BI and visualizations | Machine learning, predictive analytics, data discovery and profiling

Answered by dharanikamadasl

Answer:

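Option - It allows more flexibility to use the data in various innovative ways is an advantage of storing data in a Data Lake, without applying a specific schema to it initially. The option "It saves both developer time and company money by never having to design a schema" overstates the case: a schema is still designed, only later, at the time of analysis (schema-on-read).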

Explanation:

  • The key benefit of a data lake is that you can store any data in one place at low cost and pull it out as analytical needs arise.

Advantages of storing data in a Data Lake:

Data storage in native format:

  • A data lake removes the need for data modelling at the moment of ingestion.
  • Modelling can instead be done later, when the data is located and explored for analytics.
  • This gives unequalled flexibility for gathering insights on any business or domain-related question.

Scalability

  • A data lake scales to large data volumes and is reasonably priced compared with a conventional data warehouse.

Versatility

  • Multi-structured data from multiple sources can be stored in a data lake.
  • Put simply, logs, XML, multimedia, sensor data, binary, social data, chat, and people data can all be stored in a data lake.

Flexibility in Schema

  • Traditionally, a schema requires the data to be in a particular format before it is stored.
  • This is excellent for OLTP (application data) because it validates data before it is written.
  • However, it is a barrier for analytics, where we want to use the data as it is.
  • Traditional data warehouses are schema-based solutions.
  • A Hadoop data lake, by contrast, lets you define various schemas for the same data, or store it schema-free.
  • In essence, it separates schema from data, which is great for analytics.
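The idea of several schemas over the same stored data can be sketched in a few lines of Python. This is a minimal illustration, not how Hadoop itself does it: the raw text, the column names, and the helper `read_as` are hypothetical, and a schema here is simply a mapping from column name to a converter applied while reading.

```python
import csv
import io
from datetime import date

# Hypothetical raw file content, stored once and never rewritten.
RAW = (
    "2024-01-05,store-3,19.99\n"
    "2024-01-05,store-1,5.00\n"
    "2024-01-06,store-3,7.50\n"
)


def read_as(raw_text, schema):
    """Apply a schema (column name -> converter) while reading."""
    names = list(schema)
    for row in csv.reader(io.StringIO(raw_text)):
        # zip stops at the shorter sequence, so a schema may name
        # only the leading columns it cares about.
        yield {name: schema[name](value) for name, value in zip(names, row)}


# Schema 1: treat the third column as a number, to total sales.
sales_schema = {"day": str, "store": str, "amount": float}

# Schema 2: treat the first column as a calendar date, ignore the amount.
calendar_schema = {"day": date.fromisoformat, "store": str}

total_by_store = {}
for rec in read_as(RAW, sales_schema):
    total_by_store[rec["store"]] = (
        total_by_store.get(rec["store"], 0.0) + rec["amount"]
    )

# Monday is 0 in date.weekday(), so Friday is 4 and Saturday is 5.
weekdays = sorted({rec["day"].weekday() for rec in read_as(RAW, calendar_schema)})

print(total_by_store)
print(weekdays)
```

Both analyses read the same unchanged text; only the schema applied at read time differs, which is the separation of schema from data described above.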

Hence, for flexible analytics over raw and varied data, storing data in a data lake is more advantageous than storing it in a data warehouse.
