The *um Blog: worth knowing from the world of data, plus insights into our unbelievable company.

Data lakes: The bedrock of Big Data processing


Big Data is an umbrella term that spans Industry 4.0, the Internet of Things, machine learning and more. To avoid being left behind by competitors, and to hold their own or take a leading role in the market, companies have to generate and collect vast amounts of data so that they can continuously develop new (digital) business models.

Most companies already process large volumes of data. However, these volumes usually sit in isolated data silos that are interconnected either inadequately or not at all, which makes it impossible to filter the right information and insights out of the entire pool of data. Companies are throwing away too much potential.

Data lakes, not data swamps

The ever-growing volumes of data from an increasing number of sources require suitably large storage capacities. This poses great challenges for existing infrastructures and often leaves data untapped. What is the solution, we hear you ask? Data lakes, we say.

In a data lake, data can be collected, stored, managed, protected and analyzed in one common storage platform – and that is precisely the reason why a data lake is essential for a modern IT and data infrastructure.

A data lake strategy is a holistic approach that merges data, applications and analyses. Structured and unstructured data are incorporated, cataloged, recorded and monitored in one large lake, independently of their source or destination. The process comprises three stages: exploration, optimization and transformation.
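As a rough illustration of what cataloging structured and unstructured data independently of its source can look like, here is a minimal Python sketch. The `DataLakeCatalog` class, the `CatalogEntry` fields and the sample source names are hypothetical and chosen purely for illustration; this is an in-memory toy, not an EMC or *um API.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CatalogEntry:
    """Metadata recorded for one object in the lake (hypothetical schema)."""
    key: str
    source: str
    kind: str  # "structured" or "unstructured"
    size_bytes: int
    checksum: str
    ingested_at: datetime


@dataclass
class DataLakeCatalog:
    """In-memory stand-in for a data lake: raw payloads plus a catalog."""
    objects: dict = field(default_factory=dict)  # key -> raw bytes
    entries: dict = field(default_factory=dict)  # key -> CatalogEntry

    def ingest(self, key: str, payload: bytes, source: str, kind: str) -> CatalogEntry:
        """Store the payload unchanged and record where it came from."""
        if kind not in ("structured", "unstructured"):
            raise ValueError(f"unknown kind: {kind!r}")
        entry = CatalogEntry(
            key=key,
            source=source,
            kind=kind,
            size_bytes=len(payload),
            checksum=hashlib.sha256(payload).hexdigest(),
            ingested_at=datetime.now(timezone.utc),
        )
        self.objects[key] = payload
        self.entries[key] = entry
        return entry

    def find(self, *, source=None, kind=None) -> list:
        """Query the catalog by source and/or kind; None means 'any'."""
        return [
            e for e in self.entries.values()
            if (source is None or e.source == source)
            and (kind is None or e.kind == kind)
        ]


# Two objects from different (made-up) sources land in the same lake.
lake = DataLakeCatalog()
lake.ingest("crm/customers.csv", b"id,name\n1,Ada\n", source="crm", kind="structured")
lake.ingest("sensors/line-3.log", b"2016-05-01T12:00:00Z temp=71.3\n",
            source="iot-gateway", kind="unstructured")

for entry in lake.find(kind="unstructured"):
    print(entry.key, entry.source, entry.size_bytes)
```

The point of the sketch is that the payload is stored as-is while the catalog carries the metadata, so data from any source can be found later without knowing in advance how it will be used.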

The advantages of a data lake

There are many good reasons to have a data lake:

  • Data lakes are user-friendly

Rather than having many different storage locations that are interconnected either inadequately or even not at all, a data lake is a central storage location for all applications. This enables capacity, performance and security to be scaled without additional complexity.

  • Data lakes enable flexibility

Data lakes enable companies to decouple IT components so that they can be scaled independently of each other. By using native mechanisms, data lakes allow open, standards-based access to data.

  • Data lakes are compatible

Data lakes work in conjunction with many applications, tools and technology generations. They do not depend on specific providers or platforms, and are compatible with Windows, Mac, Linux and Unix as well as Hadoop.

  • Data lakes are efficient

Data lakes enable optimized utilization of all IT resources. They also require less storage space in datacenters and remove the need for data silos, parallel systems and duplicate copies of data.
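One way in which removing duplicates can translate into less storage space is content-addressed storage, where identical payloads are kept only once. The following Python sketch illustrates the idea under that assumption; the `DedupStore` class and the file names are hypothetical, and the sketch does not describe how any particular product works.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical payloads are stored once."""

    def __init__(self):
        self._blobs = {}  # SHA-256 digest -> payload bytes
        self._names = {}  # logical name -> SHA-256 digest

    def put(self, name: str, payload: bytes) -> str:
        """Register a logical name; keep the bytes only if they are new."""
        digest = hashlib.sha256(payload).hexdigest()
        self._blobs.setdefault(digest, payload)
        self._names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        """Return the payload behind a logical name."""
        return self._blobs[self._names[name]]

    def logical_bytes(self) -> int:
        """Size as seen by users: every name counts in full."""
        return sum(len(self._blobs[d]) for d in self._names.values())

    def physical_bytes(self) -> int:
        """Size actually held: every distinct payload counts once."""
        return sum(len(b) for b in self._blobs.values())


store = DedupStore()
report = b"quarterly figures" * 1000
store.put("sales/report.bin", report)
store.put("finance/report-copy.bin", report)  # same content, different silo
store.put("hr/handbook.txt", b"welcome")

print("logical bytes: ", store.logical_bytes())
print("physical bytes:", store.physical_bytes())
```

In this toy setup, the two departments still see their own file names, but the identical report occupies space only once.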

Up to date: Data lakes from EMC

Unlike standard providers of storage solutions, which offer only individual components of a data lake, EMC supplies a complete and fully flexible solution: various data lake platforms with industry-leading scale-out solutions for object and file storage platforms.

“The advantage of our solutions is that they work with file or object workloads, and therefore offer a simple, scalable, flexible and efficient answer to the demands of Big Data storage,” says Maurice Castillo, Senior Sales and Branch Office Leader at EMC.

Companies using EMC data lake solutions not only have all of their data in one central, secure storage location. They also gain simplified access to the data, and to the analyses of it, thanks to support for Hadoop as an industry-standard interface and to the certified integration of leading analysis providers such as Cloudera, Hortonworks, Splunk, IBM and many more.

Cooperation between *um and EMC

The unbelievable Machine Company (*um) has been working closely with EMC on storage technologies for many years. *um provides consultation services to companies integrating data lakes into their existing IT architectures and offers additional services such as operational assistance and 24/7 support, making it easier for the companies to implement and use the new storage technology.

What’s more, *um possesses great expertise in the field of data science. The company was recently acknowledged by Experton Group as a “Big Data Leader 2016” in the areas of “Big Data Consulting”, “Big Data Operations” and “Big Data Analytics-as-a-Service”.

Do you have any questions about data lakes and modern IT infrastructure? We are happy to answer them!



The unbelievable Machine
Company GmbH
Grolmanstr. 40
D-10623 Berlin

+49-30-889 26 56-0 +49-30-889 26 56-11
