Big Data and Spark Revolution

Description:

The continued rise in the volume and diversity of available data presents both opportunities and challenges for businesses that need to source relevant, targeted and timely information.

Big Data enables organizations to see and understand the competitive landscape in new and increasingly detailed ways. Leveraging this data can help them target markets, engage with new prospects, compete more effectively, and close sales.

One of the most prominent frameworks for processing big data is Apache Hadoop, which was developed for distributed processing of large data sets across clusters of computers. The framework is written in Java, but language bindings exist for most commonly used languages.

This seminar will provide you with an introduction to Big Data technologies and help you identify the benefits of Big Data for your business.

In addition, we’ll introduce Apache Spark and related projects. Apache Spark addresses the problems of speed and versatility: it is an “open source data analytics cluster computing framework” that supports different types of data analysis within the same technology stack, including fast interactive queries, streaming analysis, graph analysis and machine learning.

Who Should Attend

The primary audience for this course is Architects, Developers, CTOs and Engineering Managers. No prior Hadoop experience is required.

Required Skills

Knowledge of and experience with RDBMS and information systems

Course Contents

Demystifying Big Data

History of Database Systems
Exploration of Data
The CAP Theorem
Replication, Clustering and Sharding
Cloud Computing

Hadoop

Introduction to Hadoop
Hadoop Origin
Hadoop Distributed File System (HDFS)
Distributed Parallel Processing (MapReduce)
YARN (Yet Another Resource Negotiator)
Hadoop Ecosystem
Sqoop
Flume
ZooKeeper
Oozie
Other Projects
Kafka
Parquet
Avro

NoSQL

Key-Value Store
HBase
Cassandra
DynamoDB
Document Store
MongoDB
Couchbase
Graph Databases
Neo4j

In-Memory Technologies

Redis
VoltDB
Apache Spark
SAP HANA
Oracle In-Memory Database

Search, Indexing and Log Analysis

Elasticsearch
Splunk

YesSQL!

Hive
Impala
Drill
Phoenix
Couchbase N1QL
Spark SQL

Oracle Lines Up

Big Data Appliance
Big Data SQL
JSON API
Oracle Sharding
Spatial and Graph

Spark Introduction

Big Data concepts
Ecosystem Overview: Sqoop, Impala, Hive, Flume
Spark Basics
Spark Data Model and Operations: RDDs, Actions, Transformations
Spark Execution Models Overview: YARN, Spark Standalone
DataFrames, Spark SQL, MLlib
Spark Streaming

Lecturer
David Yahalom

David is the CTO of Naya Technologies and a senior lecturer at Naya Academy. He is an expert in databases and Big Data.

Start date: On demand
Days and hours: 9:00-16:30
Duration: 16 academic hours
Course level: Advanced
Language of instruction: Hebrew/English