![Page 1: 大数据知识及技术简介(Introduction to basic concepts and techiques of big data in Chinese)](https://reader031.vdocuments.net/reader031/viewer/2022012309/55a2022f1a28ab33268b4704/html5/thumbnails/1.jpg)
An Introduction to Big Data Concepts and Technologies
Author: 李烨
Outline
• Background
• Basic concepts
• Big data
• Big data analytics
• Related technologies
• Related roles
• Social impact
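Information overload
• 1880: the US census
• 1941: the term "information explosion"
• 1944: Fremont Rider finds that American university library collections double every 16 years
• 1961: Derek Price extends Rider's finding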
The big data era
Basic concepts
• Data
• Data visualization
• Data analysis
• Data mining
• Machine learning
• Prediction and modeling
• "Data science"
Big data
• Volume: large scale
• Velocity: high speed
• Variety: diverse types
• Value: valuable
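Big data analytics
• Differences from traditional data analysis
  • Computation moves to the data
  • Data is processed as it is generated
  • The full population replaces sampling
• Current challenges
  • Data processing: handling large-volume, high-velocity, diverse data
  • Data analysis: parallelizing existing algorithms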
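The two ideas above, processing data as it arrives and parallelizing an existing algorithm, can be illustrated with a small sketch. It is not taken from the slides: the class `RunningStats` and the helper `summarize` are hypothetical names, and the partitions are processed in a plain loop that stands in for separate workers. The sketch uses Welford's online update for a single stream and the standard pairwise merge formula to combine partial results.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Count, mean, and sum of squared deviations (m2) of the values seen so far."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        # Welford's online update: one value at a time, no history is stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial summaries computed on disjoint partitions.
        total = self.count + other.count
        if total == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / total
        return RunningStats(total, mean, m2)

    @property
    def variance(self) -> float:
        # Population variance over all values seen (0.0 if nothing was seen).
        return self.m2 / self.count if self.count else 0.0


def summarize(partition):
    """Summarize one partition; in a cluster this would run where the data lives."""
    stats = RunningStats()
    for x in partition:
        stats.update(x)
    return stats


# Three partitions stand in for data held on three machines (illustrative values).
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
partials = [summarize(p) for p in partitions]

combined = RunningStats()
for partial in partials:
    combined = combined.merge(partial)

print(combined.count, combined.mean, combined.variance)
```

Because each partition is reduced to a small summary before anything is combined, only three numbers per partition need to travel over the network, which is the practical meaning of moving computation to the data.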
Data and data analysis roles
Traditional roles
• Statistician
• Business Intelligence
• Data analyst
Big data roles
• Algorithm researcher
• Data scientist
• Data engineer
• Data quality assurance
Big data technologies
• Distributed storage + parallel computing
• Cloud computing
Hadoop
• HDFS + MapReduce
• Hadoop
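As a rough illustration of the MapReduce programming model named above, here is a single-process word count. It is a sketch of the model only and does not use the Hadoop API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real job would read its input from HDFS and run the map and reduce steps on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    # Every line is mapped independently, so this step could run in parallel.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big ideas", "Data moves fast"])
print(counts)
```

The map step never looks at more than one line and the reduce step never looks at more than one key, which is what lets a framework spread both across a cluster.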
Hadoop Alternatives & Related
• Storm
• Spark
• Mahout
• SAS on Hadoop
NoSQL Databases
• Not Only SQL
  • MongoDB
  • Redis
  • Cassandra
  • HBase
• SQL-like queries over key-value data
  • Hive
  • Pig
Impact of big data
• Quantitative analysis
• From necessity to correlation
• Information security