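I organized the main machine learning concepts into an outline. It follows the table of contents of the book "Hands-On Machine Learning with Scikit-Learn & TensorFlow", and I added a few more recent concepts that I know of. In particular, XGBoost is one of the algorithms that currently performs well on Kaggle, and graph-based clustering and density-based clustering are increasingly used for handling big data.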
1. Supervised learning
- with labeled data
- for classification or regression
  - Decision trees
  - Random forests
  - Support vector machines (SVM)
  - K-nearest neighbors
  - Linear regression
  - Logistic regression
  - Neural networks
  - XGBoost
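
To make "learning from labeled data" concrete, here is a minimal sketch of k-nearest neighbors, one of the algorithms listed above, written with only the Python standard library. The toy points, the labels "a" and "b", and the function name `knn_predict` are made up for illustration; a real project would more likely use a library such as scikit-learn.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # Rank the training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy labeled data (hypothetical): two well-separated groups.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X, y, (1.1, 1.0)))
print(knn_predict(X, y, (5.1, 5.0)))
```

The same function would perform regression if the vote were replaced by an average of numeric targets.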
2. Unsupervised learning
- with unlabeled data
- for clustering
  - K-means
  - Hierarchical clustering
  - Expectation maximization
  - Graph-based clustering
  - Density-based clustering
- for visualization & dimensionality reduction
  - PCA
  - Kernel PCA
  - Locally linear embedding (LLE)
  - t-SNE
  - UMAP
- for association rule learning
  - Apriori
  - Eclat
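
As an illustration of clustering without labels, the following is a rough sketch of k-means (Lloyd's algorithm) using only the standard library. The data, the choice of initial centroids, and the function name `kmeans` are assumptions for this example; production implementations add smarter initialization and vectorized math.

```python
import math


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in init_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # no centroid moved, so the assignment is stable
        centroids = new_centroids
    return labels, centroids


# Toy unlabeled data (hypothetical): the algorithm never sees group names.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, init_centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Unlike the supervised case, the output is only a cluster index per point; what each cluster means is left to the analyst.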
3. Semi-supervised learning
- with a small amount of labeled data and a large amount of unlabeled data
- e.g., deep neural networks pretrained on unlabeled data and then fine-tuned on labeled data
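
The idea of combining a few labels with many unlabeled points can be sketched with self-training, a much simpler technique than the deep networks mentioned above and swapped in here only because it fits in a few lines. The 1-D data and the function name `self_train` are hypothetical.

```python
import math


def self_train(labeled, unlabeled):
    """Pseudo-label unlabeled points one at a time, most confident first.

    `labeled` is a list of (point, label) pairs; `unlabeled` is a list of points.
    "Confidence" here is simply closeness to an already labeled point.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Find the unlabeled point closest to any labeled point,
        # together with the label of that nearest neighbor.
        _, idx, label = min(
            (
                (math.dist(u, p), i, lab)
                for i, u in enumerate(remaining)
                for p, lab in labeled
            ),
            key=lambda t: t[0],
        )
        # Treat the pseudo-label as if it were a real one from now on.
        labeled.append((remaining.pop(idx), label))
    return labeled


# Two labeled points and six unlabeled ones (toy data).
seed_labels = [((0.0,), "a"), ((10.0,), "b")]
unlabeled = [(1.0,), (2.0,), (3.0,), (9.0,), (8.0,), (7.0,)]

result = dict(self_train(seed_labels, unlabeled))
print(result)
```

Labels spread outward from the two seeds, so each pseudo-labeled point helps label its neighbors. The risk of the approach is that an early wrong pseudo-label propagates in the same way.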
4. Reinforcement learning
- e.g., DeepMind's AlphaGo
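
AlphaGo itself combines deep neural networks with tree search and is far beyond a short snippet, so the sketch below shows only the core reinforcement learning loop (act, observe a reward, update value estimates) using tabular Q-learning. The five-cell corridor environment, the reward of 1 at the right end, and the constants `ALPHA` and `GAMMA` are all assumptions chosen for illustration.

```python
import random

N_STATES = 5            # corridor cells 0..4; reaching cell 4 ends an episode
ACTIONS = (-1, +1)      # action index 0 = step left, 1 = step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action_idx):
    """Move within the corridor; reward 1.0 only when the last cell is reached."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_idx]
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # Explore with purely random actions; Q-learning is off-policy,
            # so it can still learn the values of the greedy policy.
            a = rng.randrange(2)
            nxt, r = step(s, a)
            # Q-learning update: move q[s][a] toward r + GAMMA * max_a' q[nxt][a'].
            q[s][a] += ALPHA * (r + GAMMA * max(q[nxt]) - q[s][a])
            s = nxt
    return q


q_table = train()
# Greedy policy for the non-terminal cells.
policy = ["right" if q[1] > q[0] else "left" for q in q_table[:-1]]
print(policy)
```

No labeled examples of "correct moves" are given; the agent learns which action is better in each cell only from the delayed reward.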
Reference
"Hands-On Machine Learning with Scikit-Learn & TensorFlow", Hanbit Media (2018)