Synopsis
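Java is a main language of practicing data scientists: much of the Hadoop ecosystem is Java-based, and most production systems in data science are written in it. If you know Java, Mastering Java Machine Learning (reprint edition, in English) by Uday Kamath and Krishna Choppella is your next step toward becoming an advanced practitioner in data science.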
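This book introduces a range of advanced techniques in machine learning, including classification, clustering, anomaly detection, stream learning, active learning, semi-supervised learning, probabilistic graph modeling, text mining, deep learning, and big data batch and stream machine learning. Each chapter comes with illustrative examples and real-world case studies that show how to apply the newly learned techniques using sound methodology and the best Java tools currently available.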
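After reading this book, you will understand the tools and techniques needed to build powerful machine learning models that solve data science problems in any domain.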
Table of Contents
Preface
Chapter 1: Machine Learning Review
Machine learning - history and definition
What is not machine learning?
Machine learning - concepts and terminology
Machine learning - types and subtypes
Datasets used in machine learning
Machine learning applications
Practical issues in machine learning
Machine learning - roles and process
Roles
Process
Machine learning - tools and datasets
Datasets
Summary
Chapter 2: Practical Approach to Real-World Supervised Learning
Formal description and notation
Data quality analysis
Descriptive data analysis
Basic label analysis
Basic feature analysis
Visualization analysis
Univariate feature analysis
Multivariate feature analysis
Data transformation and preprocessing
Feature construction
Handling missing values
Outliers
Discretization
Data sampling
Is sampling needed?
Undersampling and oversampling
Training, validation, and test set
Feature relevance analysis and dimensionality reduction
Feature search techniques
Feature evaluation techniques
Filter approach
Wrapper approach
Embedded approach
Model building
Linear models
Linear Regression
Naive Bayes
Logistic Regression
Non-linear models
Decision Trees
K-Nearest Neighbors (KNN)
Support vector machines (SVM)
Ensemble learning and meta learners
Bootstrap aggregating or bagging
Boosting
Model assessment, evaluation, and comparisons
Model assessment
Model evaluation metrics
Confusion matrix and related metrics
ROC and PRC curves
Gain charts and lift curves
Model comparisons
Comparing two algorithms
Comparing multiple algorithms
Case Study - Horse Colic Classification
Business problem
Machine learning mapping
Data analysis
Label analysis
Features analysis
Supervised learning experiments
Weka experiments
RapidMiner experiments
Results, observations, and analysis
Summary
References
Chapter 3: Unsupervised Machine Learning Techniques
……
Chapter 4: Semi-Supervised and Active Learning
Chapter 5: Real-Time Stream Machine Learning
Chapter 6: Probabilistic Graph Modeling
Chapter 7: Deep Learning
Chapter 8: Text Mining and Natural Language Processing
Chapter 9: Big Data Machine Learning - The Final Frontier
Appendix A: Linear Algebra
Appendix B: Probability
Index