Editor's Recommendation
A must-read for anyone serious about embracing the opportunities of big data.
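About the Book
This broad, deep, yet not overly technical guide introduces the fundamental principles of data science and walks you through the “data-analytic thinking” needed to extract useful knowledge and business value from the data you collect. By learning the principles of data science, you will come to understand many of the data mining techniques in use today. More importantly, these principles underpin the processes and strategies needed to solve business problems with data mining techniques.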
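Praise for the Book
“This book goes beyond the basics of data analytics. It is an essential guide for those of us (perhaps all of us) whose businesses are built on ubiquitous data opportunities and the new regime of data-driven decision making.”
-- Tom Phillips (CEO of Dstillery; former head of Google Search and Analytics)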
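“The two authors were well-known experts in this field long before the term ‘data science’ existed, and they have taken a complex topic and made it clear and accessible. This is the first book of its kind to focus on applying data science concepts to practical business problems. It is liberally sprinkled with compelling real-world examples that outline familiar, accessible problems in the business world: customer churn, targeted marketing, even whiskey analytics!
The book is unique in that it does not offer a detailed guide to algorithms; instead, it helps the reader understand the fundamental concepts behind data science and, importantly, how to succeed at solving problems. Whether you are looking for a comprehensive overview of data science or are a budding data scientist in need of the basics, this book is a must-read.”
-- Chris Volinsky (Director of Statistics Research at AT&T Labs; winner of the million-dollar Netflix Challenge)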
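“Data is the foundation of new waves of productivity growth, innovation, and richer customer insight. Only recently viewed broadly as a source of competitive advantage, handling data well is rapidly becoming table stakes for staying in the game. The authors' deep applied experience offers a window into your competitors' strategy.”
-- Alan Murray (serial entrepreneur; partner at Coriolis Ventures)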
Table of Contents
Preface
1.Introduction: Data-Analytic Thinking
The Ubiquity of Data Opportunities
Example: Hurricane Frances
Example: Predicting Customer Churn
Data Science, Engineering, and Data-Driven Decision Making
Data Processing and "Big Data"
From Big Data 1.0 to Big Data 2.0
Data and Data Science Capability as a Strategic Asset
Data-Analytic Thinking
This Book
Data Mining and Data Science, Revisited
Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
Summary
2.Business Problems and Data Science Solutions
From Business Problems to Data Mining Tasks
Supervised Versus Unsupervised Methods
Data Mining and Its Results
The Data Mining Process
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
Implications for Managing the Data Science Team
Other Analytics Techniques and Technologies
Statistics
Database Querying
Data Warehousing
Regression Analysis
Machine Learning and Data Mining
Answering Business Questions with These Techniques
Summary
3.Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
Models, Induction, and Prediction
Supervised Segmentation
Selecting Informative Attributes
Example: Attribute Selection with Information Gain
Supervised Segmentation with Tree-Structured Models
Visualizing Segmentations
Trees as Sets of Rules
Probability Estimation
Example: Addressing the Churn Problem with Tree Induction
Summary
4.Fitting a Model to Data
Classification via Mathematical Functions
Linear Discriminant Functions
Optimizing an Objective Function
An Example of Mining a Linear Discriminant from Data
Linear Discriminant Functions for Scoring and Ranking Instances
Support Vector Machines, Briefly
Regression via Mathematical Functions
Class Probability Estimation and Logistic "Regression"
Logistic Regression: Some Technical Details
Example: Logistic Regression versus Tree Induction
Nonlinear Functions, Support Vector Machines, and Neural Networks
5.Overfitting and Its Avoidance
6.Similarity, Neighbors, and Clusters
7.Decision Analytic Thinking I: What Is a Good Model?
8.Visualizing Model Performance
9.Evidence and Probabilities
10.Representing and Mining Text
11.Decision Analytic Thinking II: Toward Analytical Engineering
12.Other Data Science Tasks and Techniques
13.Data Science and Business Strategy
14.Conclusion
A.Proposal Review Guide
B.Another Sample Proposal
Glossary
Bibliography
Index
Data Science for Business (reprint edition)