Applied Machine Learning

Posts about machine learning. The contents are mostly based on the course CS498 Applied Machine Learning.

1 - Nearest Neighbors

less than 1 minute read

Suppose we are given a data point \(x\) whose label we want to find. We can simply find the closest point from the given data \({(x_1, y_1), ..., (x_N, y_N)}\) to ...
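The idea in this excerpt can be sketched in a few lines. This is a minimal 1-nearest-neighbor illustration, not code from the post; the function and variable names are my own.

```python
# Minimal 1-nearest-neighbor sketch: predict the label of the closest
# training point. `data` is a list of (x_i, y_i) pairs.
def nearest_neighbor(x, data):
    def dist(a, b):
        # Euclidean distance; treat plain numbers as 1-D vectors.
        if isinstance(a, (int, float)):
            a, b = (a,), (b,)
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    _, label = min(data, key=lambda pair: dist(x, pair[0]))
    return label

print(nearest_neighbor(2.1, [(1.0, "a"), (2.0, "b"), (5.0, "c")]))  # prints b
```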

2 - Naive Bayes

5 minute read

The Naive Bayes classifier is a classifier built from a probability model. Assume we already know the posterior probability, \(p(y \mid x)\), where x is a vector wit...

3 - Support Vector Machine

6 minute read

The Support Vector Machine is a linear model that separates classes with a hyperplane. Suppose we are given a feature vector \(a\), and we need to classify th...

4 - Principal Component Analysis

3 minute read

Principal Component Analysis is a feature extraction method. It is useful for removing noise and avoiding the curse of dimensionality. Before we get into PCA, let...

5 - NIPALS

3 minute read

Last time, we talked about Principal Component Analysis. PCA requires evaluating the covariance matrix of the data, which becomes a problem when the data gets big or has...

6 - Singular Value Decomposition

1 minute read

We’ve discussed PCA in previous posts. The problem with PCA is that computing the covariance matrix is quite expensive if the data set is in high dimension...

7 - Principal Coordinate Analysis

2 minute read

Visualization is a great tool for understanding data. However, when it comes to visualizing high-dimensional data, it is difficult to choose which dimensions to di...

8 - Canonical Correlation Analysis

4 minute read

Often, we want to know the relation between two kinds of data about one individual. For instance, we want to know how image data is related to the wor...

9 - Clustering

2 minute read

In this post, we will discuss clustering, mostly K-means clustering. Clustering is a mostly unsupervised method to group data which ...
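K-means, as mentioned in the excerpt, alternates between assigning points to their nearest center and recomputing the centers. A minimal 1-D sketch (illustrative only, not the post's code; names and data are my own):

```python
import random

# Minimal K-means sketch on 1-D points: alternate between an assignment
# step and an update step for a fixed number of iterations.
def kmeans(points, k, iters=10, seed=0):
    random.seed(seed)
    centers = random.sample(points, k)  # initialize from the data
    for _ in range(iters):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[i].append(p)
        # Update step: move each center to the mean of its cluster
        # (keep the old center if a cluster is empty).
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

centers = kmeans([0.0, 0.1, 0.2, 10.0, 10.1, 10.2], k=2)
print([round(c, 1) for c in centers])  # two centers near 0.1 and 10.1
```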

10 - Expectation Maximization

9 minute read

Before getting into the general EM algorithm, we need to know two related ideas: Jensen’s Inequality and Kullback-Leibler Divergence.
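Of the two ideas the excerpt names, the KL divergence is easy to compute for discrete distributions: \(KL(p \,\|\, q) = \sum_i p_i \log(p_i / q_i)\). A small sketch of my own, not taken from the post:

```python
import math

# Kullback-Leibler divergence between two discrete distributions p and q.
# Terms with p_i = 0 contribute nothing (the limit of p*log(p) is 0).
def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# KL(p || q) is zero when the distributions are equal, positive otherwise,
# and asymmetric in general.
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))      # prints 0.0
print(kl_divergence([0.9, 0.1], [0.5, 0.5]) > 0)  # prints True
```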

11 - Mixture of Gaussian

8 minute read

In the last article, we learned how the general Expectation Maximization algorithm works. Here, we will use the EM algorithm to fit a Gaussian Mixture model. The progr...

12 - Latent Dirichlet Allocation

4 minute read

We have discussed the general Expectation Maximization algorithm and how it can be used to optimize the Gaussian Mixture Model. In this article, we will talk...

Multi-Label Classification

7 minute read

Basic machine learning problems divide into regression and classification. Among classification problems, there are cases where an instance belongs to multiple classes rather than just one. These are called multi-label classification.

Loss

less than 1 minute read

Hinge loss

Least Squares and Convexity

1 minute read

Least squares is a method for finding a solution, used when there are more equations than unknowns. The equation is ...

Logistic Regression

1 minute read

A summary of logistic regression, a common and simple classification method.

TrueSkill Model (1)

3 minute read

Goal: summarize what I understood while studying TrueSkill, focusing on deriving the formulas that appear in the paper.