114 Training Network welcomes you to Hangzhou Boxue International Education Training Center!

400-850-8622

National learning hotline, 8:30-21:00

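Cloudera Apache Spark Programmer, Hangzhou

Training provider: Hangzhou Boxue International Education Training Center

Popularity: 74

Course price: please contact customer service

Class location: please contact customer service

Start date: rolling enrollment

Inquiry hotline: 400-850-8622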


Last updated: 2025-01-13
Cloudera Apache Spark Programmer

Class type: public course, in-house training
Course length: 3 days / 18 hours
Training dates: to be determined
Certification exam: none
Training location: Boxue International Education Training Center
Environment requirements: projector, whiteboard, large sheets of paper
Training format: instruction through examples, with on-site demonstration, practice, and prompt discussion
Training materials: course textbook

Course: Cloudera Developer Training for Apache Spark

Course overview:
Combine batch processing, stream processing, and interactive analytics to build complete, unified big data applications with Apache Spark. Learn to write sophisticated parallel applications that support faster, better decisions and real-time action across a wide range of use cases, architectures, and industries.

Audience:
Developers and engineers who want to improve the speed, ease of use, and sophistication of their applications. Participants should have a background in Python or Scala; basic Linux knowledge is a plus.

Training objectives:
- Using the Spark shell for interactive data analysis
- The features of Spark's Resilient Distributed Datasets
- How Spark runs on a cluster
- How Spark parallelizes task execution
- Writing Spark applications
- Processing streaming data with Spark

Course content:

Introduction to Spark
- What is Spark?
- Review: From Hadoop MapReduce to Spark
- Review: HDFS
- Review: YARN
- Spark Overview

Spark Basics
- Using the Spark Shell
- RDDs (Resilient Distributed Datasets)
- Functional Programming in Spark

Working with RDDs in Spark
- Creating RDDs
- Other General RDD Operations

Aggregating Data with Pair RDDs
- Key-Value Pair RDDs
- Map-Reduce
- Other Pair RDD Operations

Writing and Deploying Spark Applications
- Spark Applications vs. Spark Shell
- Creating the SparkContext
- Building a Spark Application (Scala and Java)
- Running a Spark Application
- The Spark Application Web UI
- Hands-On Exercise: Write and Run a Spark Application
- Configuring Spark Properties
- Logging

Parallel Processing
- Review: Spark on a Cluster
- RDD Partitions
- Partitioning of File-based RDDs
- HDFS and Data Locality
- Executing Parallel Operations
- Stages and Tasks

Spark RDD Persistence
- RDD Lineage
- RDD Persistence Overview
- Distributed Persistence

Basic Spark Streaming
- Spark Streaming Overview
- Example: Streaming Request Count
- DStreams
- Developing Spark Streaming Applications

Advanced Spark Streaming
- Multi-Batch Operations
- State Operations
- Sliding Window Operations
- Advanced Data Sources

Common Patterns in Spark Data Processing
- Common Spark Use Cases
- Iterative Algorithms in Spark
- Graph Processing and Analysis
- Machine Learning
- Example: k-means

Improving Spark Performance
- Shared Variables: Broadcast Variables
- Shared Variables: Accumulators
- Common Performance Issues
- Diagnosing Performance Problems

Spark SQL and DataFrames
- Spark SQL and the SQL Context
- Creating DataFrames
- Transforming and Querying DataFrames
- Saving DataFrames
- DataFrames and RDDs
- Comparing Spark SQL, Impala and Hive-on-Spark