University of Washington Open Course: Introduction-to-Data-Science-003-this-course-1.pdf


This Course
tools vs. abstractions
desktop vs. cloud
Bill Howe, UW

tools vs. abstractions
What are the abstractions of data science?
“Data Jujitsu”
“Data Wrangling”
“Data Munging”
Translation: “We have no idea what this is all about”

4/28/13 Bill Howe, UW

tools vs. abstractions
What are the abstractions of data science?
matrices and linear algebra?
relations and relational algebra?
objects and methods?
files and scripts?
data frames and functions?

Data Access Hitting a Wall
desktop vs. cloud
Current practice is based on data download (FTP/GREP).
It will not scale to the datasets of tomorrow.
•  You can GREP 1 MB in a second
•  You can GREP 1 GB in a minute
•  You can GREP 1 TB in 2 days
•  You can GREP 1 PB in 3 years
•  Oh!, and 1 PB ~5,000 disks
•  You can FTP 1 MB in 1 sec
•  You can FTP 1 GB / min (~1$)
•  … 2 days and 1K$
•  … 3 years and 1M$
•  At some point you need
     indices to limit search
     parallel data search and analysis
•  This is where databases can help
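
The scaling figures on this slide can be sanity-checked with a little arithmetic. The sketch below is not from the slides: it assumes a single sequential reader whose throughput is derived from the "1 GB in a minute" figure, and the names `scan_seconds` and `ASSUMED_BYTES_PER_SEC` are hypothetical. Under that assumption a strictly linear extrapolation gives roughly 0.7 days for 1 TB and about 1.9 years for 1 PB, the same order of magnitude as the slide's rounded "2 days" and "3 years". Splitting the scan across about 5,000 readers (mirroring the slide's disk count) brings the 1 PB scan down to a few hours, which is the case for parallel search.

```python
# Back-of-the-envelope scan times for a full sequential read (a "GREP").
# ASSUMPTION: throughput is taken from the slide's "1 GB in a minute"
# figure; it is an illustrative constant, not a measured value.

GB = 10**9
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 86_400

ASSUMED_BYTES_PER_SEC = GB / 60  # about 16.7 MB/s per reader


def scan_seconds(num_bytes, workers=1, bytes_per_sec=ASSUMED_BYTES_PER_SEC):
    """Seconds to read num_bytes once, split evenly across `workers` readers."""
    return num_bytes / (bytes_per_sec * workers)


if __name__ == "__main__":
    # One reader: time grows linearly with data size.
    for label, size in [("1 GB", GB), ("1 TB", TB), ("1 PB", PB)]:
        days = scan_seconds(size) / SECONDS_PER_DAY
        print(f"{label}: {days:.3f} days with one reader")

    # Many readers: the same scan, divided across ~5,000 disks.
    hours = scan_seconds(PB, workers=5000) / 3600
    print(f"1 PB: {hours:.1f} hours with 5,000 parallel readers")
```

Parallelism only divides the cost of a full scan; an index avoids reading most of the data in the first place, which is why the slide lists both.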
[slide src: Jim Gray]

hackers vs. analysts
