Big data in practice using Hadoop (EN/NL/FR)

Duration: 2 days

ABIS

Description

Get hands-on practice on Linux with Apache Hadoop (HDFS, YARN, Pig, and Hive) in this two-day ABIS training.

Nowadays everybody seems to be working with "big data", most often in the context of analytics and "Data Science". Do you also want to store and then query your various data sources (click streams, social media, relational data, sensor data, IoT, ...), and are you running into the limits of traditional data tools? Then you may need a distributed data store like HDFS and a MapReduce infrastructure like Hadoop's.

This course builds on the concepts introduced in the Big data architecture and infrastructure course. You will get hands-on practice on Linux with Apache Hadoop: HDFS, YARN, Pig, and Hive.

You will learn

how to implement robust data processing with an SQL-style interface that generates MapReduce jobs;
how to work with the graphical tools that make it easy to follow up on the jobs and workflows on the distributed Hadoop cluster.
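
To give a rough idea of what "an SQL-style interface that generates MapReduce jobs" means, the sketch below mimics in plain Python how an aggregation such as `SELECT page, COUNT(*) ... GROUP BY page` could be split into a map, a shuffle, and a reduce step. It runs locally and is not Hive or Hadoop code; the `clicks` records, the `page` field, and all function names are assumptions made up for this illustration.

```python
from collections import defaultdict

# Hypothetical click records; the field names are assumptions for illustration.
clicks = [
    {"page": "/home", "user": "a"},
    {"page": "/docs", "user": "b"},
    {"page": "/home", "user": "c"},
]


def map_phase(records):
    """Emit one (key, 1) pair per record, like the map side of a GROUP BY."""
    for record in records:
        yield record["page"], 1


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values per key, which corresponds to COUNT(*) per group."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_page(records):
    """Chain the three steps into one local 'job'."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(count_by_page(clicks))
```

On a real cluster the map and reduce steps run in parallel on different nodes and the shuffle moves data over the network; here everything happens in one process to keep the idea visible.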

After successful completion of the course, you will have sufficient basic expertise to set up a Hadoop cluster, to import data into HDFS, and to query it effectively using MapReduce.

If you want to use Hadoop with Spark, see the course Big data in practice using Spark.

Intended for

Whoever wants to start practising "big data": developers, data architects, and anyone who needs to work with big data technology.

Background

Familiarity with the concepts of data stores, and more specifically of "big data", is necessary; see our course Big data architecture and infrastructure. Additionally, minimal knowledge of SQL, Linux, and Java is useful. Experience with a programming language (e.g. Java, PHP, Python, Perl, C++ or C#) is a must.

Main topics

Motivation for Hadoop and basic concepts
- The Apache Hadoop project and the components of Hadoop
- HDFS: the Hadoop Distributed File System
- MapReduce: what and how
- The workings of a Hadoop cluster
Writing a MapReduce program
- Implementing MapReduce drivers, mappers, and reducers in Java
- Writing mappers and reducers in another programming or scripting language (e.g. Perl)
- Unit testing
- Writing partitioners for optimizing the load balancing
- Debugging a MapReduce program
Data Input / Output
- Reading and writing sequential data from a MapReduce program
- The use of binary data
- Data compression
Some frequently used MapReduce components
- Sorting, searching, and indexing of data
- Word counts and counting pairs of words
Working with Hive and Pig
- Pig as a high-level basic interface, which generates a sequence of MapReduce jobs
- Hive as a high-level SQL-style interface, which generates a sequence of MapReduce jobs
The Parquet file format: structure and typical use; advantages of data compression; interoperability
Short introduction to HBase and Cassandra as alternative data stores
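
As an illustration of three of the topics above (mappers and reducers in a scripting language, counting pairs of words, and partitioners), here is a minimal local sketch in Python. It is not Hadoop code and not the course material: the function names, the `crc32`-based partitioning rule, and the choice of two reducers are assumptions for this example only.

```python
import zlib
from itertools import groupby


def mapper(line):
    """Emit ((word, next_word), 1) for each adjacent pair of words in a line."""
    words = line.lower().split()
    for first, second in zip(words, words[1:]):
        yield (first, second), 1


def partition(key, num_reducers):
    """Pick a reducer for a key with a stable hash, so equal keys always meet."""
    return zlib.crc32(" ".join(key).encode("utf-8")) % num_reducers


def reducer(sorted_pairs):
    """Sum the counts per key; the input must already be sorted by key."""
    for key, group in groupby(sorted_pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


def count_word_pairs(lines, num_reducers=2):
    """Run map, partition, sort, and reduce in one process."""
    # Map every line and route each pair to the bucket of its reducer.
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        for key, value in mapper(line):
            buckets[partition(key, num_reducers)].append((key, value))
    # Each reducer sorts its own bucket and aggregates it independently.
    result = {}
    for bucket in buckets:
        result.update(reducer(sorted(bucket)))
    return result


if __name__ == "__main__":
    for pair, count in sorted(count_word_pairs(["big data big data", "big data tools"]).items()):
        print(pair, count)
```

The partitioner only decides which reducer sees which keys, so the final counts do not depend on `num_reducers`. What changes is how evenly the work is spread, which is why a custom partitioner can help with load balancing when some keys are far more frequent than others.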

Training method

Classroom instruction, with practical examples and supported by extensive practical exercises.

Delivered as live, interactive training, available in person, online, or in a hybrid format. The training can be given in English, Dutch, or French.

Certificate

At the end of the session, the participant receives a "Certificate of Completion".

Duration
2 days.
